Reading view

There are new articles available, click to refresh the page.

HAProxy EP 9: Load Balancing with Weighted Round Robin

Load balancing helps distribute client requests across multiple servers to ensure high availability, performance, and reliability. Weighted Round Robin Load Balancing is an extension of the round-robin algorithm, where each server is assigned a weight based on its capacity or performance capabilities. This approach ensures that more powerful servers handle more traffic, resulting in a more efficient distribution of the load.

What is Weighted Round Robin Load Balancing?

Weighted Round Robin Load Balancing assigns a weight to each server. The weight determines how many requests each server should handle relative to the others. Servers with higher weights receive more requests compared to those with lower weights. This method is useful when backend servers have different processing capabilities or resources.

Step-by-Step Implementation with Docker

Step 1: Create Dockerfiles for Each Flask Application

We’ll use the same three Flask applications (app1.py, app2.py, and app3.py) as in previous examples.

  • Flask App 1 (app1.py):

from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from Flask App 1!"

@app.route("/data")
def data():
    return "Data from Flask App 1!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)

  • Flask App 2 (app2.py):

from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from Flask App 2!"

@app.route("/data")
def data():
    return "Data from Flask App 2!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5002)

  • Flask App 3 (app3.py):

from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from Flask App 3!"

@app.route("/data")
def data():
    return "Data from Flask App 3!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5003)

Step 2: Create Dockerfiles for Each Flask Application

Create Dockerfiles for each of the Flask applications:

  • Dockerfile for Flask App 1 (Dockerfile.app1):

# Use the official Python image from Docker Hub
FROM python:3.9-slim

# Set the working directory inside the container
WORKDIR /app

# Copy the application file into the container
COPY app1.py .

# Install Flask inside the container
RUN pip install Flask

# Expose the port the app runs on
EXPOSE 5001

# Run the application
CMD ["python", "app1.py"]

  • Dockerfile for Flask App 2 (Dockerfile.app2):

FROM python:3.9-slim
WORKDIR /app
COPY app2.py .
RUN pip install Flask
EXPOSE 5002
CMD ["python", "app2.py"]

  • Dockerfile for Flask App 3 (Dockerfile.app3):

FROM python:3.9-slim
WORKDIR /app
COPY app3.py .
RUN pip install Flask
EXPOSE 5003
CMD ["python", "app3.py"]

Step 3: Create the HAProxy Configuration File

Create an HAProxy configuration file (haproxy.cfg) to implement Weighted Round Robin Load Balancing


global
    log stdout format raw local0
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http_front
    bind *:80
    default_backend servers

backend servers
    balance roundrobin
    server server1 app1:5001 weight 2 check
    server server2 app2:5002 weight 1 check
    server server3 app3:5003 weight 3 check

Explanation:

  • The balance roundrobin directive tells HAProxy to use the Round Robin load balancing algorithm.
  • The weight option for each server specifies the weight associated with each server:
    • server1 (App 1) has a weight of 2.
    • server2 (App 2) has a weight of 1.
    • server3 (App 3) has a weight of 3.
  • Requests will be distributed based on these weights: App 3 will receive the most requests, App 2 the least, and App 1 will be in between.

Step 4: Create a Dockerfile for HAProxy

Create a Dockerfile for HAProxy (Dockerfile.haproxy):


# Use the official HAProxy image from Docker Hub
FROM haproxy:latest

# Copy the custom HAProxy configuration file into the container
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

# Expose the port for HAProxy
EXPOSE 80

Step 5: Create a docker-compose.yml File

To manage all the containers together, create a docker-compose.yml file

version: '3'

services:
  app1:
    build:
      context: .
      dockerfile: Dockerfile.app1
    container_name: flask_app1
    ports:
      - "5001:5001"

  app2:
    build:
      context: .
      dockerfile: Dockerfile.app2
    container_name: flask_app2
    ports:
      - "5002:5002"

  app3:
    build:
      context: .
      dockerfile: Dockerfile.app3
    container_name: flask_app3
    ports:
      - "5003:5003"

  haproxy:
    build:
      context: .
      dockerfile: Dockerfile.haproxy
    container_name: haproxy
    ports:
      - "80:80"
    depends_on:
      - app1
      - app2
      - app3


Explanation:

  • The docker-compose.yml file defines the services (app1, app2, app3, and haproxy) and their respective configurations.
  • HAProxy depends on the three Flask applications to be up and running before it starts.

Step 6: Build and Run the Docker Containers

Run the following command to build and start all the containers


docker-compose up --build

This command builds Docker images for all three Flask apps and HAProxy, then starts them.

Step 7: Test the Load Balancer

Open your browser or use curl to make requests to the HAProxy server


curl http://localhost/
curl http://localhost/data

Observation:

  • With Weighted Round Robin Load Balancing, you should see that requests are distributed according to the weights specified in the HAProxy configuration.
  • For example, App 3 should receive three times more requests than App 2, and App 1 should receive twice as many as App 2.

Conclusion

By implementing Weighted Round Robin Load Balancing with HAProxy, you can distribute traffic more effectively according to the capacity or performance of each backend server. This approach helps optimize resource utilization and ensures a balanced load across servers.

HAProxy EP 8: Load Balancing with Random Load Balancing

Load balancing distributes client requests across multiple servers to ensure high availability and reliability. One of the simplest load balancing algorithms is Random Load Balancing, which selects a backend server randomly for each client request.

Although this approach does not consider server load or other metrics, it can be effective for less critical applications or when the goal is to achieve simplicity.

What is Random Load Balancing?

Random Load Balancing assigns incoming requests to a randomly chosen server from the available pool of servers. This method is straightforward and ensures that requests are distributed in a non-deterministic manner, which may work well for environments with equally capable servers and minimal concerns about server load or state.

Step-by-Step Implementation with Docker

Step 1: Create Dockerfiles for Each Flask Application

We’ll use the same three Flask applications (app1.py, app2.py, and app3.py) as in previous examples.

Flask App 1 – (app.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from Flask App 1!"

@app.route("/data")
def data():
    return "Data from Flask App 1!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)


Flask App 2 – (app.py)


from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from Flask App 2!"

@app.route("/data")
def data():
    return "Data from Flask App 2!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5002)

Flask App 3 – (app.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def home():
    return "Hello from Flask App 3!"

@app.route("/data")
def data():
    return "Data from Flask App 3!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5003)


Step 2: Create Dockerfiles for Each Flask Application

Create Dockerfiles for each of the Flask applications:

  • Dockerfile for Flask App 1 (Dockerfile.app1):
# Use the official Python image from Docker Hub
FROM python:3.9-slim

# Set the working directory inside the container
WORKDIR /app

# Copy the application file into the container
COPY app1.py .

# Install Flask inside the container
RUN pip install Flask

# Expose the port the app runs on
EXPOSE 5001

# Run the application
CMD ["python", "app1.py"]

  • Dockerfile for Flask App 2 (Dockerfile.app2):
FROM python:3.9-slim
WORKDIR /app
COPY app2.py .
RUN pip install Flask
EXPOSE 5002
CMD ["python", "app2.py"]


  • Dockerfile for Flask App 3 (Dockerfile.app3):

FROM python:3.9-slim
WORKDIR /app
COPY app3.py .
RUN pip install Flask
EXPOSE 5003
CMD ["python", "app3.py"]

Step 3: Create a Dockerfile for HAProxy

HAProxy Configuration file,


global
    log stdout format raw local0
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http_front
    bind *:80
    default_backend servers

backend servers
    balance random
    random draw 2
    server server1 app1:5001 check
    server server2 app2:5002 check
    server server3 app3:5003 check

Explanation:

  • The balance random directive tells HAProxy to use the Random load balancing algorithm.
  • The random draw 2 setting makes HAProxy select 2 servers randomly and choose the one with the least number of connections. This adds a bit of load awareness to the random choice.
  • The server directives define the backend servers and their ports.

Step 4: Create a Dockerfile for HAProxy

Create a Dockerfile for HAProxy (Dockerfile.haproxy):

# Use the official HAProxy image from Docker Hub
FROM haproxy:latest

# Copy the custom HAProxy configuration file into the container
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

# Expose the port for HAProxy
EXPOSE 80


Step 5: Create a docker-compose.yml File

To manage all the containers together, create a docker-compose.yml file:


version: '3'

services:
  app1:
    build:
      context: .
      dockerfile: Dockerfile.app1
    container_name: flask_app1
    ports:
      - "5001:5001"

  app2:
    build:
      context: .
      dockerfile: Dockerfile.app2
    container_name: flask_app2
    ports:
      - "5002:5002"

  app3:
    build:
      context: .
      dockerfile: Dockerfile.app3
    container_name: flask_app3
    ports:
      - "5003:5003"

  haproxy:
    build:
      context: .
      dockerfile: Dockerfile.haproxy
    container_name: haproxy
    ports:
      - "80:80"
    depends_on:
      - app1
      - app2
      - app3

Explanation:

  • The docker-compose.yml file defines the services (app1, app2, app3, and haproxy) and their respective configurations.
  • HAProxy depends on the three Flask applications to be up and running before it starts.

Step 6: Build and Run the Docker Containers

Run the following command to build and start all the containers:


docker-compose up --build

This command builds Docker images for all three Flask apps and HAProxy, then starts them.

Step 7: Test the Load Balancer

Open your browser or use curl to make requests to the HAProxy server:

curl http://localhost/
curl http://localhost/data

Observation:

  • With Random Load Balancing, each request should randomly hit one of the three backend servers.
  • Since the selection is random, you may not see a predictable pattern; however, the requests should be evenly distributed across the servers over a large number of requests.

Conclusion

By implementing Random Load Balancing with HAProxy, we’ve demonstrated a simple way to distribute traffic across multiple servers without relying on complex metrics or state information. While this approach may not be ideal for all use cases, it can be useful in scenarios where simplicity is more valuable than fine-tuned load distribution.

HAProxy EP 7: Load Balancing with Source IP Hash, URI – Consistent Hashing

Load balancing helps distribute traffic across multiple servers, enhancing performance and reliability. One common strategy is Source IP Hash load balancing, which ensures that requests from the same client IP are consistently directed to the same server.

This method is particularly useful for applications requiring session persistence, such as shopping carts or user sessions. In this blog, we’ll implement Source IP Hash load balancing using Flask and HAProxy, all within Docker containers.

What is Source IP Hash Load Balancing?

Source IP Hash Load Balancing is a technique that uses a hash function on the client’s IP address to determine which server should handle the request. This guarantees that a particular client will always be directed to the same backend server, ensuring session persistence and stateful behavior.

Consistent Hashing: https://parottasalna.com/2024/06/17/why-do-we-need-to-maintain-same-hash-in-load-balancer/

Step-by-Step Implementation with Docker

Step 1: Create Flask Application

We’ll create three separate Dockerfiles, one for each Flask app.

Flask App 1 (app1.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Flask App 1!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)


Flask App 2 (app2.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Flask App 2!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5002)


Flask App 3 (app3.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Flask App 3!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5003)

Each Flask app listens on a different port (5001, 5002, 5003).

Step 2: Dockerfiles for each flask application

Dockerfile for Flask App 1 (Dockerfile.app1)

# Use the official Python image from the Docker Hub
FROM python:3.9-slim

# Set the working directory inside the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY app1.py .

# Install Flask inside the container
RUN pip install Flask

# Expose the port the app runs on
EXPOSE 5001

# Run the application
CMD ["python", "app1.py"]

Dockerfile for Flask App 2 (Dockerfile.app2)

FROM python:3.9-slim
WORKDIR /app
COPY app2.py .
RUN pip install Flask
EXPOSE 5002
CMD ["python", "app2.py"]

Dockerfile for Flask App 3 (Dockerfile.app3)

FROM python:3.9-slim
WORKDIR /app
COPY app3.py .
RUN pip install Flask
EXPOSE 5003
CMD ["python", "app3.py"]

Step 3: Create a configuration for HAProxy

global
    log stdout format raw local0
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http_front
    bind *:80
    default_backend servers

backend servers
    balance source
    hash-type consistent
    server server1 app1:5001 check
    server server2 app2:5002 check
    server server3 app3:5003 check

Explanation:

  • The balance source directive tells HAProxy to use Source IP Hashing as the load balancing algorithm.
  • The hash-type consistent directive ensures consistent hashing, which is essential for minimizing disruption when backend servers are added or removed.
  • The server directives define the backend servers and their ports.

Step 4: Create a Dockerfile for HAProxy

Create a Dockerfile for HAProxy (Dockerfile.haproxy)

# Use the official HAProxy image from Docker Hub
FROM haproxy:latest

# Copy the custom HAProxy configuration file into the container
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

# Expose the port for HAProxy
EXPOSE 80

Step 5: Create a Dockercompose file

To manage all the containers together, create a docker-compose.yml file

version: '3'

services:
  app1:
    build:
      context: .
      dockerfile: Dockerfile.app1
    container_name: flask_app1
    ports:
      - "5001:5001"

  app2:
    build:
      context: .
      dockerfile: Dockerfile.app2
    container_name: flask_app2
    ports:
      - "5002:5002"

  app3:
    build:
      context: .
      dockerfile: Dockerfile.app3
    container_name: flask_app3
    ports:
      - "5003:5003"

  haproxy:
    build:
      context: .
      dockerfile: Dockerfile.haproxy
    container_name: haproxy
    ports:
      - "80:80"
    depends_on:
      - app1
      - app2
      - app3

Explanation:

  • The docker-compose.yml file defines four services: app1, app2, app3, and haproxy.
  • Each Flask app is built from its respective Dockerfile and runs on its port.
  • HAProxy is configured to wait (depends_on) for all three Flask apps to be up and running.

Step 6: Build and Run the Docker Containers

Run the following commands to build and start all the containers:

# Build and run the containers
docker-compose up --build

This command will build Docker images for all three Flask apps and HAProxy and start them up in the background.

Step 7: Test the Load Balancer

Open your browser or use a tool like curl to make requests to the HAProxy server:

curl http://localhost

Observation:

  • With Source IP Hash load balancing, each unique IP address (e.g., your local IP) should always be directed to the same backend server.
  • If you access the HAProxy from different IPs (e.g., using different devices or by simulating different client IPs), you will see that requests are consistently sent to the same server for each IP.

For the URI based hashing we just need to add,

global
    log stdout format raw local0
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http_front
    bind *:80
    default_backend servers

backend servers
    balance uri
    hash-type consistent
    server server1 app1:5001 check
    server server2 app2:5002 check
    server server3 app3:5003 check


Explanation:

  • The balance uri directive tells HAProxy to use URI Hashing as the load balancing algorithm.
  • The hash-type consistent directive ensures consistent hashing to minimize disruption when backend servers are added or removed.
  • The server directives define the backend servers and their ports.

HAProxy Ep 6: Load Balancing With Least Connection

Load balancing is crucial for distributing incoming network traffic across multiple servers, ensuring optimal resource utilization and improving application performance. One of the simplest and most popular load balancing algorithms is Round Robin. In this blog, we’ll explore how to implement Least Connection load balancing using Flask as our backend application and HAProxy as our load balancer.

What is Least Connection Load Balancing?

Least Connection Load Balancing is a dynamic algorithm that distributes requests to the server with the fewest active connections at any given time. This method ensures that servers with lighter loads receive more requests, preventing any single server from becoming a bottleneck.

Step-by-Step Implementation with Docker

Step 1: Create Dockerfiles for Each Flask Application

We’ll create three separate Dockerfiles, one for each Flask app.

Flask App 1 (app1.py) – Introduced Slowness by adding sleep

from flask import Flask
import time

app = Flask(__name__)

@app.route("/")
def hello():
    time.sleep(5)
    return "Hello from Flask App 1!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)


Flask App 2 (app2.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Flask App 2!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5002)


Flask App 3 (app3.py) – Introduced Slowness by adding sleep.

from flask import Flask
import time

app = Flask(__name__)

@app.route("/")
def hello():
    time.sleep(5)
    return "Hello from Flask App 3!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5003)

Each Flask app listens on a different port (5001, 5002, 5003).

Step 2: Dockerfiles for each flask application

Dockerfile for Flask App 1 (Dockerfile.app1)

# Use the official Python image from the Docker Hub
FROM python:3.9-slim

# Set the working directory inside the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY app1.py .

# Install Flask inside the container
RUN pip install Flask

# Expose the port the app runs on
EXPOSE 5001

# Run the application
CMD ["python", "app1.py"]

Dockerfile for Flask App 2 (Dockerfile.app2)

FROM python:3.9-slim
WORKDIR /app
COPY app2.py .
RUN pip install Flask
EXPOSE 5002
CMD ["python", "app2.py"]

Dockerfile for Flask App 3 (Dockerfile.app3)

FROM python:3.9-slim
WORKDIR /app
COPY app3.py .
RUN pip install Flask
EXPOSE 5003
CMD ["python", "app3.py"]

Step 3: Create a configuration for HAProxy

global
    log stdout format raw local0
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http_front
    bind *:80
    default_backend servers

backend servers
    balance leastconn
    server server1 app1:5001 check
    server server2 app2:5002 check
    server server3 app3:5003 check

Explanation:

  • frontend http_front: Defines the entry point for incoming traffic. It listens on port 80.
  • backend servers: Specifies the servers HAProxy will distribute traffic evenly the three Flask apps (app1, app2, app3). The balance leastconn directive sets the Least Connection for load balancing.
  • server directives: Lists the backend servers with their IP addresses and ports. The check option allows HAProxy to monitor the health of each server.

Step 4: Create a Dockerfile for HAProxy

Create a Dockerfile for HAProxy (Dockerfile.haproxy)

# Use the official HAProxy image from Docker Hub
FROM haproxy:latest

# Copy the custom HAProxy configuration file into the container
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

# Expose the port for HAProxy
EXPOSE 80

Step 5: Create a Dockercompose file

To manage all the containers together, create a docker-compose.yml file

version: '3'

services:
  app1:
    build:
      context: .
      dockerfile: Dockerfile.app1
    container_name: flask_app1
    ports:
      - "5001:5001"

  app2:
    build:
      context: .
      dockerfile: Dockerfile.app2
    container_name: flask_app2
    ports:
      - "5002:5002"

  app3:
    build:
      context: .
      dockerfile: Dockerfile.app3
    container_name: flask_app3
    ports:
      - "5003:5003"

  haproxy:
    build:
      context: .
      dockerfile: Dockerfile.haproxy
    container_name: haproxy
    ports:
      - "80:80"
    depends_on:
      - app1
      - app2
      - app3

Explanation:

  • The docker-compose.yml file defines four services: app1, app2, app3, and haproxy.
  • Each Flask app is built from its respective Dockerfile and runs on its port.
  • HAProxy is configured to wait (depends_on) for all three Flask apps to be up and running.

Step 6: Build and Run the Docker Containers

Run the following commands to build and start all the containers:

# Build and run the containers
docker-compose up --build

This command will build Docker images for all three Flask apps and HAProxy and start them up in the background.

You should see the responses alternating between “Hello from Flask App 1!”, “Hello from Flask App 2!”, and “Hello from Flask App 3!” as HAProxy uses the Round Robin algorithm to distribute requests.

Step 7: Test the Load Balancer

Open your browser or use a tool like curl to make requests to the HAProxy server:

curl http://localhost

You should see responses cycling between “Hello from Flask App 1!”, “Hello from Flask App 2!”, and “Hello from Flask App 3!” according to the Least Connection strategy.

HAProxy EP 5: Load Balancing With Round Robin

Load balancing is crucial for distributing incoming network traffic across multiple servers, ensuring optimal resource utilization and improving application performance. One of the simplest and most popular load balancing algorithms is Round Robin. In this blog, we’ll explore how to implement Round Robin load balancing using Flask as our backend application and HAProxy as our load balancer.

What is Round Robin Load Balancing?

Round Robin load balancing works by distributing incoming requests sequentially across a group of servers.

For example, the first request goes to Server A, the second to Server B, the third to Server C, and so on. Once all servers have received a request, the cycle repeats. This algorithm is simple and works well when all servers have similar capabilities.

Step-by-Step Implementation with Docker

Step 1: Create Dockerfiles for Each Flask Application

We’ll create three separate Dockerfiles, one for each Flask app.

Flask App 1 (app1.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Flask App 1!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5001)


Flask App 2 (app2.py)

from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Flask App 2!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5002)


Flask App 3 (app3.py)


from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Flask App 3!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5003)

Each Flask app listens on a different port (5001, 5002, 5003).

Step 2: Dockerfiles for each flask application

Dockerfile for Flask App 1 (Dockerfile.app1)


# Use the official Python image from the Docker Hub
FROM python:3.9-slim

# Set the working directory inside the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY app1.py .

# Install Flask inside the container
RUN pip install Flask

# Expose the port the app runs on
EXPOSE 5001

# Run the application
CMD ["python", "app1.py"]

Dockerfile for Flask App 2 (Dockerfile.app2)


FROM python:3.9-slim
WORKDIR /app
COPY app2.py .
RUN pip install Flask
EXPOSE 5002
CMD ["python", "app2.py"]

Dockerfile for Flask App 3 (Dockerfile.app3)


FROM python:3.9-slim
WORKDIR /app
COPY app3.py .
RUN pip install Flask
EXPOSE 5003
CMD ["python", "app3.py"]

Step 3: Create a configuration for HAProxy

global
    log stdout format raw local0
    daemon

defaults
    log     global
    mode    http
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http_front
    bind *:80
    default_backend servers

backend servers
    balance roundrobin
    server server1 app1:5001 check
    server server2 app2:5002 check
    server server3 app3:5003 check

Explanation:

  • frontend http_front: Defines the entry point for incoming traffic. It listens on port 80.
  • backend servers: Specifies the servers HAProxy will distribute traffic evenly the three Flask apps (app1, app2, app3). The balance roundrobin directive sets the Round Robin algorithm for load balancing.
  • server directives: Lists the backend servers with their IP addresses and ports. The check option allows HAProxy to monitor the health of each server.

Step 4: Create a Dockerfile for HAProxy

Create a Dockerfile for HAProxy (Dockerfile.haproxy)


# Use the official HAProxy image from Docker Hub
FROM haproxy:latest

# Copy the custom HAProxy configuration file into the container
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

# Expose the port for HAProxy
EXPOSE 80

Step 5: Create a Dockercompose file

To manage all the containers together, create a docker-compose.yml file


version: '3'

services:
  app1:
    build:
      context: .
      dockerfile: Dockerfile.app1
    container_name: flask_app1
    ports:
      - "5001:5001"

  app2:
    build:
      context: .
      dockerfile: Dockerfile.app2
    container_name: flask_app2
    ports:
      - "5002:5002"

  app3:
    build:
      context: .
      dockerfile: Dockerfile.app3
    container_name: flask_app3
    ports:
      - "5003:5003"

  haproxy:
    build:
      context: .
      dockerfile: Dockerfile.haproxy
    container_name: haproxy
    ports:
      - "80:80"
    depends_on:
      - app1
      - app2
      - app3

Explanation:

  • The docker-compose.yml file defines four services: app1, app2, app3, and haproxy.
  • Each Flask app is built from its respective Dockerfile and runs on its port.
  • HAProxy is configured to wait (depends_on) for all three Flask apps to be up and running.

Step 6: Build and Run the Docker Containers

Run the following commands to build and start all the containers:


# Build and run the containers
docker-compose up --build

This command will build Docker images for all three Flask apps and HAProxy and start them up in the background.

You should see the responses alternating between “Hello from Flask App 1!”, “Hello from Flask App 2!”, and “Hello from Flask App 3!” as HAProxy uses the Round Robin algorithm to distribute requests.

Step 7: Test the Load Balancer

Open your browser or use a tool like curl to make requests to the HAProxy server:


curl http://localhost

HAProxy EP 4: Understanding ACL – Access Control List

Imagine you are managing a busy highway with multiple lanes, and you want to direct specific types of vehicles to particular lanes: trucks to one lane, cars to another, and motorcycles to yet another. In the world of web traffic, this is similar to what Access Control Lists (ACLs) in HAProxy do—they help you direct incoming requests based on specific criteria.

Let’s dive into what ACLs are in HAProxy, why they are essential, and how you can use them effectively with some practical examples.

What are ACLs in HAProxy?

Access Control Lists (ACLs) in HAProxy are rules or conditions that allow you to define patterns to match incoming requests. These rules help you make decisions about how to route or manage traffic within your infrastructure.

Think of ACLs as powerful filters or guards that analyze incoming HTTP requests based on headers, IP addresses, URL paths, or other attributes. By defining ACLs, you can control how requests are handled—for example, sending specific traffic to different backends, applying security rules, or denying access under certain conditions.

Why Use ACLs in HAProxy?

Using ACLs offers several advantages:

  1. Granular Control Over Traffic: You can filter and route traffic based on very specific criteria, such as the content of HTTP headers, cookies, or request methods.
  2. Security: ACLs can block unwanted traffic, enforce security policies, and prevent malicious access.
  3. Performance Optimization: By directing traffic to specific servers optimized for certain types of content, ACLs can help balance the load and improve performance.
  4. Flexibility and Scalability: ACLs allow dynamic adaptation to changing traffic patterns or new requirements without significant changes to your infrastructure.

How ACLs Work in HAProxy

ACLs in HAProxy are defined in the configuration file (haproxy.cfg). The syntax is straightforward


acl <name> <criteria>
  • <name>: The name you give to your ACL rule, which you will use to reference it in further configuration.
  • <criteria>: The condition or match pattern, such as a path, header, method, or IP address.

It either returns True or False.

Examples of ACLs in HAProxy

Let’s look at some practical examples to understand how ACLs work.

Example 1: Routing Traffic Based on URL Path

Suppose you have a web application that serves both static and dynamic content. You want to route all requests for static files (like images, CSS, and JavaScript) to a server optimized for static content, while all other requests should go to a dynamic content server.

Configuration:


frontend http_front
    bind *:80
    acl is_static path_beg /static
    use_backend static_backend if is_static
    default_backend dynamic_backend

backend static_backend
    server static1 127.0.0.1:5001 check

backend dynamic_backend
    server dynamic1 127.0.0.1:5002 check

  • ACL Definition: acl is_static path_beg /static : checks if the request URL starts with /static.
  • Usage: use_backend static_backend if is_static routes the traffic to the static_backend if the ACL is_static matches. All other requests are routed to the dynamic_backend.

Example 2: Blocking Traffic from Specific IP Addresses

Let’s say you want to block traffic from a range of IP addresses that are known to be malicious.

Configurations

frontend http_front
    bind *:80
    acl block_ip src 192.168.1.0/24
    http-request deny if block_ip
    default_backend web_backend

backend web_backend
    server web1 127.0.0.1:5003 check


ACL Definition:acl block_ip src 192.168.1.0/24 defines an ACL that matches any source IP from the range 192.168.1.0/24.

Usage:http-request deny if block_ip denies the request if it matches the block_ip ACL.

Example 4: Redirecting Traffic Based on Request Method

You might want to redirect all POST requests to a different backend for further processing.

Configurations


frontend http_front
    bind *:80
    acl is_post_method method POST
    use_backend post_backend if is_post_method
    default_backend general_backend

backend post_backend
    server post1 127.0.0.1:5006 check

backend general_backend
    server general1 127.0.0.1:5007 check

Example 5: Redirect Traffic Based on User Agent

Imagine you want to serve a different version of your website to mobile users versus desktop users. You can achieve this by using ACLs that check the User-Agent header in the HTTP request.

Configuration:


frontend http_front
    bind *:80
    acl is_mobile_user_agent req.hdr(User-Agent) -i -m sub Mobile
    use_backend mobile_backend if is_mobile_user_agent
    default_backend desktop_backend

backend mobile_backend
    server mobile1 127.0.0.1:5008 check

backend desktop_backend
    server desktop1 127.0.0.1:5009 check

ACL Definition:acl is_mobile_user_agent req.hdr(User-Agent) -i -m sub Mobile checks if the User-Agent header contains the substring "Mobile" (case-insensitive).

Usage:use_backend mobile_backend if is_mobile_user_agent directs mobile users to mobile_backend and all other users to desktop_backend.

Example 6: Restrict Access to Admin Pages by IP Address

Let’s say you want to allow access to the /admin page only from a specific IP address or range, such as your company’s internal network.


frontend http_front
    bind *:80
    acl is_admin_path path_beg /admin
    acl is_internal_network src 192.168.10.0/24
    http-request deny if is_admin_path !is_internal_network
    default_backend web_backend

backend web_backend
    server web1 127.0.0.1:5015 check

Example with a Flask Application

Let’s see how you can use ACLs with a Flask application to enforce different rules.

Flask Application Setup

You have two Flask apps: app1.py for general requests and app2.py for special requests like form submissions.

app1.py

from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    return "Welcome to the main page!"

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5003)

app2.py:

from flask import Flask

app = Flask(__name__)

@app.route('/submit', methods=['POST'])
def submit_form():
    return "Form submitted successfully!"

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5004)


HAProxy Configuration with ACLs


frontend http_front
    bind *:80
    acl is_post_method method POST
    acl is_submit_path path_beg /submit
    use_backend post_backend if is_post_method is_submit_path
    default_backend general_backend

backend post_backend
    server app2 127.0.0.1:5004 check

backend general_backend
    server app1 127.0.0.1:5003 check

ACLs:

  • is_post_method checks for the POST method.
  • is_submit_path checks if the path starts with /submit.

Traffic Handling: The traffic is directed to post_backend if both the ACLs match, otherwise, it goes to general_backend.

HAProxy EP 3: Sarah’s Adventure with L7 Load Balancing and HAProxy

Meet Sarah, a backend developer at “BrightApps,” a fast-growing startup specializing in custom web applications. Recently, BrightApps launched a new service called “FitGuru,” a health and fitness platform that quickly gained traction. However, as the platform’s user base started to grow, the team noticed performance issues—page loads were slow, and users began to complain.

Sarah knew that simply scaling up their backend servers might not solve the problem. What they needed was a smarter way to handle incoming traffic and distribute it across their servers. That’s when she decided to dive into the world of Layer 7 (L7) load balancing with HAProxy.

Understanding L7 Load Balancing

Layer 7 load balancing operates at the Application Layer of the OSI model. Unlike Layer 4 (L4) load balancing, which only considers information from the Transport Layer (like IP addresses and ports), an L7 load balancer examines the actual content of the HTTP requests. This deeper inspection allows it to make more intelligent decisions on how to route traffic.

Here’s why Sarah chose L7 load balancing for “FitGuru”:

  1. Content-Based Routing: Sarah could route requests to different servers based on the URL path, HTTP headers, cookies, or even specific parameters in the request. For example, requests for video content could be directed to a server optimized for handling media, while API requests could go to a server focused on data processing.
  2. SSL Termination: The L7 load balancer could handle the SSL encryption and decryption, relieving the backend servers from this CPU-intensive task.
  3. Advanced Health Checks: Sarah could set up health checks that simulate real user traffic to ensure backend servers are actually serving content correctly, not just responding to pings.
  4. Enhanced Security: With L7, she could filter out malicious traffic more effectively by inspecting request contents, blocking suspicious patterns, and protecting the app from common web attacks.

Step 1: Sarah’s Plan with HAProxy as an HTTP Proxy

Sarah decided to configure HAProxy as an HTTP proxy. This way, it would operate at Layer 7 and provide advanced traffic management capabilities. She had a few objectives:

  • Route traffic based on the URL path to different servers.
  • Offload SSL termination to HAProxy.
  • Serve static files from specific backend servers and dynamic content from others.

Sarah started with a simple Flask application to test her configuration:

Flask Application Setup

Sarah created two basic Flask apps:

  1. Static Content Server (static_app.py):

from flask import Flask, send_from_directory

app = Flask(__name__)

@app.route('/static/<path:filename>')
def serve_static(filename):
    return send_from_directory('static', filename)

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5001)

This app served static content from a folder named static.

  1. Dynamic Content Server (dynamic_app.py):

from flask import Flask

app = Flask(__name__)

@app.route('/')
def home():
    return "Welcome to FitGuru!"

@app.route('/api/data')
def api_data():
    return {"data": "Some dynamic data"}

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5002)

This app handled dynamic requests like API endpoints and the home page.

Step 2: Configuring HAProxy for HTTP Proxy

Sarah then moved on to configure HAProxy. She created an HAProxy configuration file (haproxy.cfg) to route traffic based on URL paths


global
    log stdout format raw local0
    maxconn 4096

defaults
    mode http
    log global
    option httplog
    option dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend http_front
    bind *:80

    acl is_static path_beg /static
    use_backend static_backend if is_static
    default_backend dynamic_backend

backend static_backend
    balance roundrobin
    server static1 127.0.0.1:5001 check

backend dynamic_backend
    balance roundrobin
    server dynamic1 127.0.0.1:5002 check

Explanation of the Configuration

  1. Frontend Configuration (http_front):
    • The frontend listens on ports 80 (HTTP).
    • An ACL (is_static) is defined to identify requests for static content based on the URL path prefix /static.
    • Requests that match the is_static ACL are routed to the static_backend. All other requests are routed to the dynamic_backend.
  2. Backend Configuration:
    • The static_backend handles static content requests and uses a round-robin strategy to distribute traffic between the servers (in this case, just static1).
    • The dynamic_backend handles all other requests in a similar manner.

Step 3: Deploying HAProxy with Docker

Sarah decided to use Docker to deploy HAProxy quickly:

Dockerfile for HAProxy:


FROM haproxy:2.4

COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

Build and Run:


docker build -t haproxy-http .
docker run -d -p 80:80 -p 443:443 haproxy-http


This command runs HAProxy in a Docker container, listening on ports 80.

Step 4: Testing the Setup

Now, it was time to test!

  1. Static Content Test:
    • Sarah visited http://localhost:5000/static/logo.png. The L7 load balancer identified the /static path and routed the request to static_backend.
  2. Dynamic Content Test:
    • Visiting http://localhost:5000 or http://localhost:5000/api/data confirmed that requests were routed to the dynamic_backend as expected.

The Result: A Smoother Experience for “FitGuru”

With L7 load balancing in place, “FitGuru” was now more responsive and could efficiently handle the incoming traffic surge:

  • Optimized Performance: Static content requests were efficiently served from servers dedicated to that purpose, while dynamic content was processed by more capable machines.
  • Enhanced Security: SSL termination was handled by HAProxy, and the backend servers were freed from CPU-intensive encryption tasks.
  • Flexible Traffic Management: Sarah could now easily add or modify rules to adapt to changing traffic patterns or requirements.

By implementing Layer 7 load balancing with HAProxy, Sarah provided “FitGuru” with a robust and scalable solution that ensured a seamless user experience, even during peak times. Now, she could confidently tackle the growing demands of their expanding user base, knowing the platform was built to handle whatever traffic came its way.

Layer 7 load balancing was more than just a tool; it was a strategy that allowed Sarah to understand, control, and optimize traffic in a way that best suited their application’s unique needs. And with HAProxy, she had all the flexibility and power she needed to keep “FitGuru” running smoothly.

HAProxy EP 2: TCP Proxy for Flask Application

Meet Jafer, a backend engineer tasked with ensuring the new microservice they are building can handle high traffic smoothly. The microservice is a Flask application that needs to be accessed over TCP, and Jafer decided to use HAProxy to act as a TCP proxy to manage incoming traffic.

This guide will walk you through how Jafer sets up HAProxy to work as a TCP proxy for a sample Flask application.

Why Use HAProxy as a TCP Proxy?

HAProxy as a TCP proxy operates at Layer 4 (Transport Layer) of the OSI model. It forwards raw TCP connections from clients to backend servers without inspecting the contents of the packets. This is ideal for scenarios where:

  • You need to handle non-HTTP traffic, such as databases or other TCP-based applications.
  • You want to perform load balancing without application-level inspection.
  • Your services are using protocols other than HTTP/HTTPS.

In this layer, it can’t read the packets but can identify the ip address of the client.

Step 1: Set Up a Sample Flask Application

First, Jafer created a simple Flask application that listens on a TCP port. Let’s create a file named app.py

from flask import Flask, request

app = Flask(__name__)

@app.route('/', methods=['GET'])
def home():
    return "Hello from Flask over TCP!"

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=5000)  # Run the app on port 5000


Step 2: Dockerize the Flask Application

To make the Flask app easy to deploy, Jafer decided to containerize it using Docker.

Create a Dockerfile

# Use an official Python runtime as a parent image
FROM python:3.9-slim

# Set the working directory
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install flask

# Make port 5000 available to the world outside this container
EXPOSE 5000

# Run app.py when the container launches
CMD ["python", "app.py"]


To build and run the Docker container, use the following commands

docker build -t flask-app .
docker run -d -p 5000:5000 flask-app

This will start the Flask application on port 5000.

Step 3: Configure HAProxy as a TCP Proxy

Now, Jafer needs to configure HAProxy to act as a TCP proxy for the Flask application.

Create an HAProxy configuration file named haproxy.cfg

global
    log stdout format raw local0
    maxconn 4096

defaults
    mode tcp  # Operating in TCP mode
    log global
    option tcplog
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

frontend tcp_front
    bind *:4000  # Bind to port 4000 for incoming TCP traffic
    default_backend flask_backend

backend flask_backend
    balance roundrobin  # Use round-robin load balancing
    server flask1 127.0.0.1:5000 check  # Proxy to Flask app running on port 5000

In this configuration:

  • Mode TCP: HAProxy is set to work in TCP mode.
  • Frontend: Listens on port 4000 and forwards incoming TCP traffic to the backend.
  • Backend: Contains a single server (flask1) where the Flask app is running.

Step 4: Run HAProxy with the Configuration

To start HAProxy with the above configuration, you can use Docker to run HAProxy in a container.

Create a Dockerfile for HAProxy

FROM haproxy:2.4

# Copy the HAProxy configuration file to the container
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

Build and run the HAProxy Docker container

docker build -t haproxy-tcp .
docker run -d -p 4000:4000 haproxy-tcp

This will start HAProxy on port 4000, which is configured to proxy TCP traffic to the Flask application running on port 5000.

Step 5: Test the TCP Proxy Setup

To test the setup, open a web browser or use curl to send a request to the HAProxy server

curl http://localhost:4000/

You should see the response

Hello from Flask over TCP!

This confirms that HAProxy is successfully proxying TCP traffic to the Flask application.

Step 6: Scaling Up

If Jafer wants to scale the application to handle more traffic, he can add more backend servers to the haproxy.cfg file

backend flask_backend
    balance roundrobin
    server flask1 127.0.0.1:5000 check
    server flask2 127.0.0.1:5001 check

Jafer could run another instance of the Flask application on a different port (5001), and HAProxy would balance the TCP traffic between the two instances.

Conclusion

By configuring HAProxy as a TCP proxy, Jafer could efficiently manage and balance incoming traffic to their Flask application. This setup ensures scalability and reliability for any TCP-based service, not just HTTP-based ones.

HAProxy EP 1: Traffic Police for Web

In the world of web applications, imagine you’re running a very popular pizza place. Every evening, customers line up for a delicious slice of pizza. But if your single cashier can’t handle all the orders at once, customers might get frustrated and leave.

What if you could have a system that ensures every customer gets served quickly and efficiently? Enter HAProxy, a tool that helps manage and balance the flow of web traffic so that no single server gets overwhelmed.

Here’s a straightforward guide to understanding HAProxy, installing it, and setting it up to make your web application run smoothly.

What is HAProxy?

HAProxy stands for High Availability Proxy. It’s like a traffic director for your web traffic. It takes incoming requests (like people walking into your pizza place) and decides which server (or pizza station) should handle each request. This way, no single server gets too busy, and everything runs more efficiently.

Why Use HAProxy?

  • Handles More Traffic: Distributes incoming traffic across multiple servers so no single one gets overloaded.
  • Increases Reliability: If one server fails, HAProxy directs traffic to the remaining servers.
  • Improves Performance: Ensures that users get faster responses because the load is spread out.

Installing HAProxy

Here’s how you can install HAProxy on a Linux system:

  1. Open a Terminal: You’ll need to access your command line interface to install HAProxy.
  2. Install HAProxy: Type the following command and hit enter

sudo apt-get update
sudo apt-get install haproxy

3. Check Installation: Once installed, you can verify that HAProxy is running by typing


sudo systemctl status haproxy

This command shows you the current status of HAProxy, ensuring it’s up and running.

Configuring HAProxy

HAProxy’s configuration file is where you set up how it should handle incoming traffic. This file is usually located at /etc/haproxy/haproxy.cfg. Let’s break down the main parts of this configuration file,

1. The global Section

The global section is like setting the rules for the entire pizza place. It defines general settings for HAProxy itself, such as how it should operate, what kind of logging it should use, and what resources it needs. Here’s an example of what you might see in the global section


global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660
    user haproxy
    group haproxy
    daemon

Let’s break it down line by line:

  • log /dev/log local0: This line tells HAProxy to send log messages to the system log at /dev/log and to use the local0 logging facility. Logs help you keep track of what’s happening with HAProxy.
  • log /dev/log local1 notice: Similar to the previous line, but it uses the local1 logging facility and sets the log level to notice, which is a type of log message indicating important events.
  • chroot /var/lib/haproxy: This line tells HAProxy to run in a restricted area of the file system (/var/lib/haproxy). It’s a security measure to limit access to the rest of the system.
  • stats socket /run/haproxy/admin.sock mode 660: This sets up a special socket (a kind of communication endpoint) for administrative commands. The mode 660 part defines the permissions for this socket, allowing specific users to manage HAProxy.
  • user haproxy: Specifies that HAProxy should run as the user haproxy. Running as a specific user helps with security.
  • group haproxy: Similar to the user directive, this specifies that HAProxy should run under the haproxy group.
  • daemon: This tells HAProxy to run as a background service, rather than tying up a terminal window.

2. The defaults Section

The defaults section sets up default settings for HAProxy’s operation and is like defining standard procedures for the pizza place. It applies default configurations to both the frontend and backend sections unless overridden. Here’s an example of a defaults section


defaults
    log     global
    option  httplog
    option  dontlognull
    timeout connect 5000ms
    timeout client  50000ms
    timeout server  50000ms

Here’s what each line means:

  • log global: Tells HAProxy to use the logging settings defined in the global section for logging.
  • option httplog: Enables HTTP-specific logging. This means HAProxy will log details about HTTP requests and responses, which helps with troubleshooting and monitoring.
  • option dontlognull: Prevents logging of connections that don’t generate any data (null connections). This keeps the logs cleaner and more relevant.
  • timeout connect 5000ms: Sets the maximum time HAProxy will wait when trying to connect to a backend server to 5000 milliseconds (5 seconds). If the connection takes longer, it will be aborted.
  • timeout client 50000ms: Defines the maximum time HAProxy will wait for data from the client to 50000 milliseconds (50 seconds). If the client doesn’t send data within this time, the connection will be closed.
  • timeout server 50000ms: Similar to timeout client, but it sets the maximum time to wait for data from the server to 50000 milliseconds (50 seconds).

3. Frontend Section

The frontend section defines how HAProxy listens for incoming requests. Think of it as the entrance to your pizza place.


frontend http_front
    bind *:80
    default_backend http_back
  • frontend http_front: This is a name for your frontend configuration.
  • bind *:80: Tells HAProxy to listen for traffic on port 80 (the standard port for web traffic).
  • default_backend http_back: Specifies where the traffic should be sent (to the backend section).

4. Backend Section

The backend section describes where the traffic should be directed. Think of it as the different pizza stations where orders are processed.


backend http_back
    balance roundrobin
    server app1 192.168.1.2:5000 check
    server app2 192.168.1.3:5000 check
    server app3 192.168.1.4:5000 check
  • backend http_back: This is a name for your backend configuration.
  • balance roundrobin: Distributes traffic evenly across servers.
  • server app1 192.168.1.2:5000 check: Specifies a server (app1) at IP address 192.168.1.2 on port 5000. The check option ensures HAProxy checks if the server is healthy before sending traffic to it.
  • server app2 and server app3: Additional servers to handle traffic.

Testing Your Configuration

After setting up your configuration, you’ll need to restart HAProxy to apply the changes:


sudo systemctl restart haproxy

To check if everything is working, you can use a web browser or a tool like curl to send requests to HAProxy and see if it correctly distributes them across your servers.

Mastering Request Retrying in Python with Tenacity: A Developer’s Journey

Meet Jafer, a talented developer (self boast) working at a fast growing tech company. His team is building an innovative app that fetches data from multiple third-party APIs in realtime to provide users with up-to-date information.

Everything is going smoothly until one day, a spike in traffic causes their app to face a wave of “HTTP 500” and “Timeout” errors. Requests start failing left and right, and users are left staring at the dreaded “Data Unavailable” message.

Jafer realizes that he needs a way to make their app more resilient against these unpredictable network hiccups. That’s when he discovers Tenacity a powerful Python library designed to help developers handle retries gracefully.

Join Jafer as he dives into Tenacity and learns how to turn his app from fragile to robust with just a few lines of code!

Step 0: Mock FLASK Api

from flask import Flask, jsonify, make_response
import random
import time

app = Flask(__name__)

# Scenario 1: Random server errors
@app.route('/random_error', methods=['GET'])
def random_error():
    if random.choice([True, False]):
        return make_response(jsonify({"error": "Server error"}), 500)  # Simulate a 500 error randomly
    return jsonify({"message": "Success"})

# Scenario 2: Timeouts
@app.route('/timeout', methods=['GET'])
def timeout():
    time.sleep(5)  # Simulate a long delay that can cause a timeout
    return jsonify({"message": "Delayed response"})

# Scenario 3: 404 Not Found error
@app.route('/not_found', methods=['GET'])
def not_found():
    return make_response(jsonify({"error": "Not found"}), 404)

# Scenario 4: Rate-limiting (simulated with a fixed chance)
@app.route('/rate_limit', methods=['GET'])
def rate_limit():
    if random.randint(1, 10) <= 3:  # 30% chance to simulate rate limiting
        return make_response(jsonify({"error": "Rate limit exceeded"}), 429)
    return jsonify({"message": "Success"})

# Scenario 5: Empty response
@app.route('/empty_response', methods=['GET'])
def empty_response():
    if random.choice([True, False]):
        return make_response("", 204)  # Simulate an empty response with 204 No Content
    return jsonify({"message": "Success"})

if __name__ == '__main__':
    app.run(host='localhost', port=5000, debug=True)

To run the Flask app, use the command,

python mock_server.py

Step 1: Introducing Tenacity

Jafer decides to start with the basics. He knows that Tenacity will allow him to retry failed requests without cluttering his codebase with complex loops and error handling. So, he installs the library,

pip install tenacity

With Tenacity ready, Jafer decides to tackle his first problem, retrying a request that fails due to server errors.

Step 2: Retrying on Exceptions

He writes a simple function that fetches data from an API and wraps it with Tenacity’s @retry decorator

import requests
import logging
from tenacity import before_log, after_log
from tenacity import retry, stop_after_attempt, wait_fixed

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@retry(stop=stop_after_attempt(3),
        wait=wait_fixed(2),
        before=before_log(logger, logging.INFO),
        after=after_log(logger, logging.INFO))
def fetch_random_error():
    response = requests.get('http://localhost:5000/random_error')
    response.raise_for_status()  # Raises an HTTPError for 4xx/5xx responses
    return response.json()
 
if __name__ == '__main__':
    try:
        data = fetch_random_error()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

This code will attempt the request up to 3 times, waiting 2 seconds between each try. Jafer feels confident that this will handle the occasional hiccup. However, he soon realizes that he needs more control over which exceptions trigger a retry.

Step 3: Handling Specific Exceptions

Jafer’s app sometimes receives a “404 Not Found” error, which should not be retried because the resource doesn’t exist. He modifies the retry logic to handle only certain exceptions,

import requests
import logging
from tenacity import before_log, after_log
from requests.exceptions import HTTPError, Timeout
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_fixed
 

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@retry(stop=stop_after_attempt(3),
        wait=wait_fixed(2),
        retry=retry_if_exception_type((HTTPError, Timeout)),
        before=before_log(logger, logging.INFO),
        after=after_log(logger, logging.INFO))
def fetch_data():
    response = requests.get('http://localhost:5000/timeout', timeout=2)  # Set a short timeout to simulate failure
    response.raise_for_status()
    return response.json()

if __name__ == '__main__':
    try:
        data = fetch_data()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

Now, the function retries only on HTTPError or Timeout, avoiding unnecessary retries for a “404” error. Jafer’s app is starting to feel more resilient!

Step 4: Implementing Exponential Backoff

A few days later, the team notices that they’re still getting rate-limited by some APIs. Jafer recalls the concept of exponential backoff a strategy where the wait time between retries increases exponentially, reducing the load on the server and preventing further rate limiting.

He decides to implement it,

import requests
import logging
from tenacity import before_log, after_log
from tenacity import retry, stop_after_attempt, wait_exponential

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@retry(stop=stop_after_attempt(5),
       wait=wait_exponential(multiplier=1, min=2, max=10),
       before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_rate_limit():
    response = requests.get('http://localhost:5000/rate_limit')
    response.raise_for_status()
    return response.json()
 
if __name__ == '__main__':
    try:
        data = fetch_rate_limit()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

With this code, the wait time starts at 2 seconds and doubles with each retry, up to a maximum of 10 seconds. Jafer’s app is now much less likely to be rate-limited!

Step 5: Retrying Based on Return Values

Jafer encounters another issue: some APIs occasionally return an empty response (204 No Content). These cases should also trigger a retry. Tenacity makes this easy with the retry_if_result feature,

import requests
import logging
from tenacity import before_log, after_log

from tenacity import retry, stop_after_attempt, retry_if_result

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
  

@retry(retry=retry_if_result(lambda x: x is None), stop=stop_after_attempt(3), before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_empty_response():
    response = requests.get('http://localhost:5000/empty_response')
    if response.status_code == 204:
        return None  # Simulate an empty response
    response.raise_for_status()
    return response.json()
 
if __name__ == '__main__':
    try:
        data = fetch_empty_response()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

Now, the function retries when it receives an empty response, ensuring that users get the data they need.

Step 6: Combining Multiple Retry Conditions

But Jafer isn’t done yet. Some situations require combining multiple conditions. He wants to retry on HTTPError, Timeout, or a None return value. With Tenacity’s retry_any feature, he can do just that,

import requests
import logging
from tenacity import before_log, after_log

from requests.exceptions import HTTPError, Timeout
from tenacity import retry_any, retry, retry_if_exception_type, retry_if_result, stop_after_attempt
 
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@retry(retry=retry_any(retry_if_exception_type((HTTPError, Timeout)), retry_if_result(lambda x: x is None)), stop=stop_after_attempt(3), before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_data():
    response = requests.get("http://localhost:5000/timeout")
    if response.status_code == 204:
        return None
    response.raise_for_status()
    return response.json()

if __name__ == '__main__':
    try:
        data = fetch_data()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

This approach covers all his bases, making the app even more resilient!

Step 7: Logging and Tracking Retries

As the app scales, Jafer wants to keep an eye on how often retries happen and why. He decides to add logging,

import logging
import requests
from tenacity import before_log, after_log
from tenacity import retry, stop_after_attempt, wait_fixed

 
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
 
@retry(stop=stop_after_attempt(2), wait=wait_fixed(2),
       before=before_log(logger, logging.INFO),
       after=after_log(logger, logging.INFO))
def fetch_data():
    response = requests.get("http://localhost:5000/timeout", timeout=2)
    response.raise_for_status()
    return response.json()

if __name__ == '__main__':
    try:
        data = fetch_data()
        print("Data fetched successfully:", data)
    except Exception as e:
        print("Failed to fetch data:", str(e))

This logs messages before and after each retry attempt, giving Jafer full visibility into the retry process. Now, he can monitor the app’s behavior in production and quickly spot any patterns or issues.

The Happy Ending

With Tenacity, Jafer has transformed his app into a resilient powerhouse that gracefully handles intermittent failures. Users are happy, the servers are humming along smoothly, and Jafer’s team has more time to work on new features rather than firefighting network errors.

By mastering Tenacity, Jafer has learned that handling network failures gracefully can turn a fragile app into a robust and reliable one. Whether it’s dealing with flaky APIs, network blips, or rate limits, Tenacity is his go-to tool for retrying operations in Python.

So, the next time your app faces unpredictable network challenges, remember Jafer’s story and give Tenacity a try you might just save the day!

The Search for the Perfect Media Server: A Journey of Discovery

Dinesh, an avid movie collector and music lover, had a growing problem. His laptop was bursting at the seams with countless movies, albums, and family photos. Every time he wanted to watch a movie or listen to her carefully curated playlists, he had to sit around his laptop. And if he wanted to share something with his friends, it meant copying with USB drives or spending hours transferring files.

One Saturday evening, after yet another struggle to connect his laptop to his smart TV via a mess of cables, Dinesh decided it was time for a change. He needed a solution that would let his access all his media from any device in his house – phone, tablet, and TV. He needed a media server.

Dinesh fired up his browser and began his search: “How to stream media to all my devices.” He gone through the results – Plex, Jellyfin, Emby… Each option seemed promising but felt too complex, requiring subscriptions or heavy installations.

Frustrated, Dinesh thought, “There must be something simpler. I don’t need all the bells and whistles; I just want to access my files from anywhere in my house.” He refined her search: “lightweight media server for Linux.”

There it was – MiniDLNA. Described as a simple, lightweight DLNA server that was easy to set up and perfect for home use, MiniDLNA (also known as ReadyMedia) seemed to be exactly what Dinesh needed.

MiniDLNA (also known as ReadyMedia) is a lightweight, simple server for streaming media (like videos, music, and pictures) to devices on your network. It is compatible with various DLNA/UPnP (Digital Living Network Alliance/Universal Plug and Play) devices such as smart TVs, media players, gaming consoles, etc.

How to Use MiniDLNA

Here’s a step-by-step guide to setting up and using MiniDLNA on a Linux based system.

1. Install MiniDLNA

To get started, you need to install MiniDLNA. The installation steps can vary slightly depending on your operating system.

For Debian/Ubuntu-based systems:

sudo apt update
sudo apt install minidlna

For Red Hat/CentOS-based systems:

First, enable the EPEL repository,

sudo yum install epel-release

Then, install MiniDLNA,

sudo yum install minidlna

2. Configure MiniDLNA

Once installed, you need to configure MiniDLNA to tell it where to find your media files.

a. Open the MiniDLNA configuration file in a text editor

sudo nano /etc/minidlna.conf

b. Configure the following parameters:

  • media_dir: Set this to the directories where your media files (music, pictures, and videos) are stored. You can specify different media types for each directory.
media_dir=A,/path/to/music  # 'A' is for audio
media_dir=V,/path/to/videos # 'V' is for video
media_dir=P,/path/to/photos # 'P' is for pictures
  • db_dir=: The directory where the database and cache files are stored.
db_dir=/var/cache/minidlna
  • log_dir=: The directory where log files are stored.
log_dir=/var/log/minidlna
  • friendly_name=: The name of your media server. This will appear on your DLNA devices.
friendly_name=Laptop SJ
  • notify_interval=: The interval in seconds that MiniDLNA will notify clients of its presence. The default is 900 (15 minutes).
notify_interval=900

c. Save and close the file (Ctrl + X, Y, Enter in Nano).

3. Start the MiniDLNA Service

After configuration, start the MiniDLNA service

sudo systemctl start minidlna

To enable it to start at boot,

sudo systemctl enable minidlna

4. Rescan Media Files

To make MiniDLNA scan your media files and add them to its database, you can force a rescan with

sudo minidlnad -R

5. Access Your Media on DLNA/UPnP Devices

Now, your MiniDLNA server should be up and running. You can access your media from any DLNA-compliant device on your network:

  • On your Smart TV, look for the “Media Server” or “DLNA” option in the input/source menu.
  • On a Windows PC, go to This PC or Network and find your DLNA server under “Media Devices.”
  • On Android, use a media player app like VLC or BubbleUPnP to find your server.

6. Check Logs and Troubleshoot

If you encounter any issues, you can check the logs for more information

sudo tail -f /var/log/minidlna/minidlna.log

To setup for a single user

Disable the global daemon

sudo service minidlna stop
sudo update-rc.d minidlna disable

Create the necessary local files and directories as regular user and edit the configuration

mkdir -p ~/.minidlna/cache
cd ~/.minidlna
cp /etc/minidlna.conf .
$EDITOR minidlna.conf

Configure as you would globally above but these definitions need to be defined locally

db_dir=/home/$USER/.minidlna/cache
log_dir=/home/$USER/.minidlna 

To start the daemon locally

minidlnad -f /home/$USER/.minidlna/minidlna.conf -P /home/$USER/.minidlna/minidlna.pid

To stop the local daemon

xargs kill </home/$USER/.minidlna/minidlna.pid

To rebuild the database,

minidlnad -f /home/$USER/.minidlna/minidlna.conf -R

For more info: https://help.ubuntu.com/community/MiniDLNA

Additional Tips

  • Firewall Rules: Ensure that your firewall settings allow traffic on the MiniDLNA port (8200 by default) and UPnP (typically port 1900 for UDP).
  • Update Media Files: Whenever you add or remove files from your media directory, run minidlnad -R to update the database.
  • Multiple Media Directories: You can have multiple media_dir lines in your configuration if your media is spread across different folders.

To set up MiniDLNA with VLC Media Player so you can stream content from your MiniDLNA server, follow these steps:

Let’s see how to use this in VLC

On Machine

1. Install VLC Media Player

Make sure you have VLC Media Player installed on your device. If not, you can download it from the official VLC website.

2. Open VLC Media Player

Launch VLC Media Player on your computer.

3. Open the UPnP/DLNA Network Stream

  1. Go to the “View” Menu:
    • On the VLC menu bar, click on View and then Playlist or press Ctrl + L (Windows/Linux) or Cmd + Shift + P (Mac).
  2. Locate Your DLNA Server:
    • In the left sidebar, you will see an option for Local Network.
    • Click on Universal Plug'n'Play or UPnP.
    • VLC will search for available DLNA/UPnP servers on your network.
  3. Select Your MiniDLNA Server:
    • After a few moments, your MiniDLNA server should appear under the UPnP section.
    • Click on your server name (e.g., My DLNA Server).
  4. Browse and Play Media:
    • You will see the folders you configured (e.g., Music, Videos, Pictures).
    • Navigate through the folders and double-click on a media file to start streaming.

4. Alternative Method: Open Network Stream

If you know the IP address of your MiniDLNA server, you can connect directly:

  1. Open Network Stream:
    • Click on Media in the menu bar and select Open Network Stream... or press Ctrl + N (Windows/Linux) or Cmd + N (Mac).
  2. Enter the URL:
    • Enter the URL of your MiniDLNA server in the format http://[Server IP]:8200.
    • Example: http://192.168.1.100:8200.
  3. Click “Play”:
    • Click on the Play button to start streaming from your MiniDLNA server.

5. Tips for Better Streaming Experience

  • Ensure the Server is Running: Make sure the MiniDLNA server is running and the media files are correctly indexed.
  • Network Stability: A stable local network connection is necessary for smooth streaming. Use a wired connection if possible or ensure a strong Wi-Fi signal.
  • Firewall Settings: Ensure that the firewall on your server allows traffic on port 8200 (or the port specified in your MiniDLNA configuration).

On Android

To set up and stream content from MiniDLNA using an Android app, you will need a DLNA/UPnP client app that can discover and stream media from DLNA servers. Several apps are available for this purpose, such as VLC for Android, BubbleUPnP, Kodi, and others. Here’s how to use VLC for Android and BubbleUPnP, two popular choices

Using VLC for Android

  1. Install VLC for Android:
  2. Open VLC for Android:
    • Launch the VLC app on your Android device.
  3. Access the Local Network:
    • Tap on the menu button (three horizontal lines) in the upper-left corner of the screen.
    • Select Local Network from the sidebar menu.
  4. Find Your MiniDLNA Server:
    • VLC will automatically search for DLNA/UPnP servers on your local network. After a few moments, your MiniDLNA server should appear in the list.
    • Tap on the name of your MiniDLNA server (e.g., My DLNA Server).
  5. Browse and Play Media:
    • You will see your media folders (e.g., Music, Videos, Pictures) as configured in your MiniDLNA setup.
    • Navigate to the desired folder and tap on any media file to start streaming.

Additional Tips

  • Ensure MiniDLNA is Running: Make sure your MiniDLNA server is properly configured and running on your local network.
  • Check Network Connection: Ensure your Android device is connected to the same local network (Wi-Fi) as the MiniDLNA server.
  • Firewall Settings: If you are not seeing the MiniDLNA server in your app, ensure that the server’s firewall settings allow DLNA/UPnP traffic.

Some Problems That you may face

  1. minidlna.service: Main process exited, code=exited, status=255/EXCEPTION - check the logs. Mostly its due to an instance already running on port 8200. Kill that and reload the db. lsof -i :8200 will give PID. and `kill -9 <PID>` will kill the process.
  2. If the media files is not refreshing, then try minidlnad -f /home/$USER/.minidlna/minidlna.conf -R or `sudo minidlnad -R`

Lucifer and the Git-Powered Calculator: The Complete Adventure

In losangels , a young coder named Lucifer set out on a mission to build his very own calculator. Along the way, he learned how to use Git, a powerful tool that would help him track his progress and manage his code. Here’s the complete story of how Lucifer built his calculator, step by step, with the power of Git.

Step 1: Setting Up the Project with git init

Lucifer started by creating a new directory for his project and initializing a Git repository. This was like setting up a magical vault to store all his coding adventures.


mkdir MagicCalculator
cd MagicCalculator
git init

This command created the .git directory inside the MagicCalculator folder, where Git would keep track of everything.

Step 2: Configuring His Identity with git config

Before getting too far, Lucifer needed to make sure Git knew who he was. He configured his username and email address, so every change he made would be recorded in his name.


git config --global user.name "Lucifer"
git config --global user.email "lucifer@codeville.com"

Step 3: Writing the Addition Function and Staging It with git add

Lucifer began his calculator project by writing a simple function to add two numbers. He created a new Python file named main.py and added the following code,


# main.py

def add(x, y):
    return x + y

# Simple test to ensure it's working
print(add(5, 3))  # Output should be 8

Happy with his progress, Lucifer used the git add command to stage his changes. This was like preparing the code to be saved in Git’s memory.


git add main.py

Step 4: Committing the First Version with git commit

Next, Lucifer made his first commit. This saved the current state of his project, along with a message describing what he had done.


git commit -m "Added addition function"

Now, Git had recorded the addition function as the first chapter in the history of Lucifer’s project.

Step 5: Adding More Functions and Committing Them

Lucifer continued to add more functions to his calculator. First, he added subtraction,


# main.py

def add(x, y):
    return x + y

def subtract(x, y):
    return x - y

# Simple tests
print(add(5, 3))       # Output: 8
print(subtract(5, 3))  # Output: 2

He then staged and committed the subtraction function,


git add main.py
git commit -m "Added subtraction function"

Lucifer added multiplication and division next,


# main.py

def add(x, y):
    return x + y

def subtract(x, y):
    return x - y

def multiply(x, y):
    return x * y

def divide(x, y):
    if y != 0:
        return x / y
    else:
        return "Cannot divide by zero!"

# Simple tests
print(add(5, 3))       # Output: 8
print(subtract(5, 3))  # Output: 2
print(multiply(5, 3))  # Output: 15
print(divide(5, 3))    # Output: 1.666...
print(divide(5, 0))    # Output: Cannot divide by zero!

Again, he staged and committed these changes,


git add main.py
git commit -m "Added multiplication and division functions"

Step 6: Branching Out with git branch and git checkout

Lucifer had an idea to add a feature that would let users choose which operation to perform. However, he didn’t want to risk breaking his existing code. So, he created a new branch to work on this feature.


git branch operation-choice
git checkout operation-choice

Now on the operation-choice branch, Lucifer wrote the code to let users select an operation,


# main.py

def add(x, y):
    return x + y

def subtract(x, y):
    return x - y

def multiply(x, y):
    return x * y

def divide(x, y):
    if y != 0:
        return x / y
    else:
        return "Cannot divide by zero!"

def calculator():
    print("Select operation:")
    print("1. Add")
    print("2. Subtract")
    print("3. Multiply")
    print("4. Divide")

    choice = input("Enter choice (1/2/3/4): ")

    num1 = float(input("Enter first number: "))
    num2 = float(input("Enter second number: "))

    if choice == '1':
        print(f"{num1} + {num2} = {add(num1, num2)}")

    elif choice == '2':
        print(f"{num1} - {num2} = {subtract(num1, num2)}")

    elif choice == '3':
        print(f"{num1} * {num2} = {multiply(num1, num2)}")

    elif choice == '4':
        print(f"{num1} / {num2} = {divide(num1, num2)}")

    else:
        print("Invalid input")

# Run the calculator
calculator()

Step 7: Merging the Feature into the Main Branch

After testing his new feature and making sure it worked, Lucifer was ready to merge it back into the main branch. He switched back to the main branch and merged the changes


git checkout main
git merge operation-choice

With this, the feature was successfully added to his calculator project.

Conclusion: Lucifer’s Git-Powered Calculator

By the end of his adventure, Lucifer had built a fully functional calculator and learned how to use Git to manage his code. His calculator could add, subtract, multiply, and divide, and even let users choose which operation to perform.

Thanks to Git, Lucifer’s project was well-organized, and he had a complete history of all the changes he made. He knew that if he ever needed to revisit an old version or experiment with new features, Git would be there to help.

Lucifer’s calculator project was a success, and with his newfound Git skills, he felt ready to take on even bigger challenges in the future.

Different Database Models

Database models define the structure, relationships, and operations that can be performed on a database. Different database models are used based on the specific needs of an application or organization. Here are the most common types of database models:

1. Hierarchical Database Model

  • Structure: Data is organized in a tree-like structure with a single root, where each record has a single parent but can have multiple children.
  • Usage: Best for applications with a clear hierarchical relationship, like organizational structures or file systems.
  • Example: IBM’s Information Management System (IMS).
  • Advantages: Fast access to data through parent-child relationships.
  • Disadvantages: Rigid structure; difficult to reorganize or restructure.

2. Network Database Model

  • Structure: Data is organized in a graph structure, where each record can have multiple parent and child records, forming a network of relationships.
  • Usage: Useful for complex relationships, such as in telecommunications or transportation networks.
  • Example: Integrated Data Store (IDS).
  • Advantages: Flexible representation of complex relationships.
  • Disadvantages: Complex design and navigation; can be difficult to maintain.

3. Relational Database Model

  • Structure: Data is organized into tables (relations) where each table consists of rows (records) and columns (fields). Relationships between tables are managed through keys.
  • Usage: Widely used in various applications, including finance, retail, and enterprise software.
  • Example: MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server.
  • Advantages: Simplicity, data integrity, flexibility in querying through SQL.
  • Disadvantages: Can be slower for very large datasets or highly complex queries.

4. Object-Oriented Database Model

  • Structure: Data is stored as objects, similar to objects in object-oriented programming. Each object contains both data and methods for processing the data.
  • Usage: Suitable for applications that require the modeling of complex data and relationships, such as CAD, CAM, and multimedia databases.
  • Example: db4o, ObjectDB.
  • Advantages: Seamless integration with object-oriented programming languages, reusability of objects.
  • Disadvantages: Complexity, not as widely adopted as relational databases.

5. Document-Oriented Database Model

  • Structure: Data is stored in document collections, with each document being a self-contained piece of data often in JSON, BSON, or XML format.
  • Usage: Ideal for content management systems, real-time analytics, and big data applications.
  • Example: MongoDB, CouchDB.
  • Advantages: Flexible schema design, scalability, ease of storing hierarchical data.
  • Disadvantages: May require denormalization, leading to potential data redundancy.

6. Key-Value Database Model

  • Structure: Data is stored as key-value pairs, where each key is unique, and the value can be a string, number, or more complex data structure.
  • Usage: Best for applications requiring fast access to simple data, such as caching, session management, and real-time analytics.
  • Example: Redis, DynamoDB, Riak.
  • Advantages: High performance, simplicity, scalability.
  • Disadvantages: Limited querying capabilities, lack of complex relationships.

7. Column-Family Database Model

  • Structure: Data is stored in columns rather than rows, with each column family containing a set of columns that are logically related.
  • Usage: Suitable for distributed databases, handling large volumes of data across multiple servers.
  • Example: Apache Cassandra, HBase.
  • Advantages: High write and read performance, efficient storage of sparse data.
  • Disadvantages: Complexity in design and maintenance, not as flexible for ad-hoc queries.

8. Graph Database Model

  • Structure: Data is stored as nodes (entities) and edges (relationships) forming a graph. Each node represents an object, and edges represent the relationships between objects.
  • Usage: Ideal for social networks, recommendation engines, fraud detection, and any scenario where relationships between entities are crucial.
  • Example: Neo4j, Amazon Neptune.
  • Advantages: Efficient traversal and querying of complex relationships, flexible schema.
  • Disadvantages: Not as efficient for operations on large sets of unrelated data.

9. Multimodel Database

  • Structure: Supports multiple data models (e.g., relational, document, graph) within a single database engine.
  • Usage: Useful for applications that require different types of data storage and querying mechanisms.
  • Example: ArangoDB, Microsoft Azure Cosmos DB.
  • Advantages: Flexibility, ability to handle diverse data requirements within a single system.
  • Disadvantages: Complexity in management and optimization.

10. Time-Series Database Model

  • Structure: Specifically designed to handle time-series data, where each record is associated with a timestamp.
  • Usage: Best for applications like monitoring, logging, and real-time analytics where data changes over time.
  • Example: InfluxDB, TimescaleDB.
  • Advantages: Optimized for handling and querying large volumes of time-stamped data.
  • Disadvantages: Limited use cases outside of time-series data.

11. NoSQL Database Model

  • Structure: An umbrella term for various non-relational database models, including key-value, document, column-family, and graph databases.
  • Usage: Ideal for handling unstructured or semi-structured data, and scenarios requiring high scalability and flexibility.
  • Example: MongoDB, Cassandra, Couchbase, Neo4j.
  • Advantages: Flexibility, scalability, high performance for specific use cases.
  • Disadvantages: Lack of standardization, potential data consistency challenges.

Summary

Each database model serves different purposes, and the choice of model depends on the specific requirements of the application, such as data structure, relationships, performance needs, and scalability. While relational databases are still the most widely used, NoSQL and specialized databases have become increasingly important for handling diverse data types and large-scale applications.

Postgres Ep 2 : Amutha Hotel and Issues with Flat Files

Once upon a time in ooty, there was a small business called “Amutha Hotel,” run by a passionate baker named Saravanan. Saravanan bakery was famous for its delicious sambar, and as his customer base grew, he needed to keep track of orders, customer information, and inventory.

Being a techie, he decided to store all this information in a flat file a simple spreadsheet named “HotelData.csv.”

The Early Days: Simple and Sweet

At first, everything was easy. Saravanan’s flat file had only a few columns, OrderID, CustomerName, Product, Quantity, and Price. Each row represented a new order, and it was simple enough to manage. Saravanan could quickly find orders, calculate totals, and even check his inventory by filtering the file.

The Business Grows: Complexity Creeps In

As the business boomed, Saravanan started offering new products, special discounts, and loyalty programs. He added more columns to her flat file, like Discount, LoyaltyPoints, and DeliveryAddress. He once-simple file began to swell with information.

Then, Saravanan decided to start tracking customer preferences and order history. He began adding multiple rows for the same customer, each representing a different order. His flat file now had repeating groups of data for each customer, and it became harder and harder to find the information he needed.

His flat file was getting out of hand. For every new order from a returning customer, he had to re-enter all their information

CustomerName, DeliveryAddress, LoyaltyPoints

over and over again. This duplication wasn’t just tedious; it started to cause mistakes. One day, he accidentally typed “John Smyth” instead of “John Smith,” and suddenly, his loyal customer was split into two different entries.

On a Busy Saturday

One busy Saturday, Saravanan opened his flat file to update the day’s orders, but instead of popping up instantly as it used to, it took several minutes to load. As he scrolled through the endless rows, his computer started to lag, and the spreadsheet software even crashed a few times. The file had become too large and cumbersome for him to handle efficiently.

Customers were waiting longer for their orders to be processed because Saravanan was struggling to find their previous details and apply the right discounts. The flat file that once served his so well was now slowing her down, and it was affecting her business.

The Journaling

Techie Saravanan started to note these issues in to a notepad. He badly wants a solution which will solve these problems. So he started listing out the problems with examples to look for a solution.

His journal continues …

Before databases became common for data storage, flat files (such as CSVs or text files) were often used to store and manage data. The data file that we use has no special structure; it’s just some lines of text that mean something to the particular application that reads it. It has no inherent structure

However, these flat files posed several challenges, particularly when dealing with repeating groups, which are essentially sets of related fields that repeat multiple times within a record. Here are some of the key problems associated with repeating groups in flat files,

1. Data Redundancy

  • Description: Repeating groups can lead to significant redundancy, as the same data might need to be repeated across multiple records.
  • Example: If an employee can have multiple skills, a flat file might need to repeat the employee’s name, ID, and other details for each skill.
  • Problem: This not only increases the file size but also makes data entry, updates, and deletions more prone to errors.

Eg: Suppose you are maintaining a flat file to track employees and their skills. Each employee can have multiple skills, which you store as repeating groups in the file.

EmployeeID, EmployeeName, Skill1, Skill2, Skill3, Skill4
1, John Doe, Python, SQL, Java, 
2, Jane Smith, Excel, PowerPoint, Python, SQL

If an employee has four skills, you need to add four columns (Skill1, Skill2, Skill3, Skill4). If an employee has more than four skills, you must either add more columns or create a new row with repeated employee details.

2. Data Inconsistency

  • Description: Repeating groups can lead to inconsistencies when data is updated.
  • Example: If an employee’s name changes, and it’s stored multiple times in different rows because of repeating skills, it’s easy for some instances to be updated while others are not.
  • Problem: This can lead to situations where the same employee is listed under different names or IDs in the same file.

Eg: Suppose you are maintaining a flat file to track employees and their skills. Each employee can have multiple skills, which you store as repeating groups in the file.

EmployeeID, EmployeeName, Skill1, Skill2, Skill3, Skill4
1, John Doe, Python, SQL, Java, 
2, Jane Smith, Excel, PowerPoint, Python, SQL

If John’s name changes to “John A. Doe,” you must manually update each occurrence of “John Doe” across all rows, which increases the chance of inconsistencies.

3. Difficulty in Querying

  • Description: Querying data in flat files with repeating groups can be cumbersome and inefficient.
  • Example: Extracting a list of unique employees with their respective skills requires complex scripting or manual processing.
  • Problem: Unlike relational databases, which use joins to simplify such queries, flat files require custom logic to manage and extract data, leading to slower processing and more potential for errors.

Eg: Suppose you are maintaining a flat file to track employees and their skills. Each employee can have multiple skills, which you store as repeating groups in the file.

EmployeeID, EmployeeName, Skill1, Skill2, Skill3, Skill4
1, John Doe, Python, SQL, Java, 
2, Jane Smith, Excel, PowerPoint, Python, SQL

Extracting a list of all employees proficient in “Python” requires you to search across multiple skill columns (Skill1, Skill2, etc.), which is cumbersome compared to a relational database where you can use a simple JOIN on a normalized EmployeeSkills table.

4. Limited Scalability

  • Description: Flat files do not scale well when the number of repeating groups or the size of the data grows.
  • Example: A file with multiple repeating fields can become extremely large and difficult to manage as the number of records increases.
  • Problem: This can lead to performance issues, such as slow read/write operations and difficulty in maintaining the file over time.

Eg: You are storing customer orders in a flat file where each customer can place multiple orders.

CustomerID, CustomerName, Order1ID, Order1Date, Order2ID, Order2Date, Order3ID, Order3Date
1001, Alice Brown, 5001, 2023-08-01, 5002, 2023-08-15, 
1002, Bob White, 5003, 2023-08-05, 

If Alice places more than three orders, you’ll need to add more columns (Order4ID, Order4Date, etc.), leading to an unwieldy file with many empty cells for customers with fewer orders.

5. Challenges in Data Integrity

  • Description: Ensuring data integrity in flat files with repeating groups is difficult.
  • Example: Enforcing rules like “an employee can only have unique skills” is nearly impossible in a flat file format.
  • Problem: This can result in duplicated or invalid data, which is hard to detect and correct without a database system.

Eg: You are storing customer orders in a flat file where each customer can place multiple orders.

CustomerID, CustomerName, Order1ID, Order1Date, Order2ID, Order2Date, Order3ID, Order3Date
1001, Alice Brown, 5001, 2023-08-01, 5002, 2023-08-15, 
1002, Bob White, 5003, 2023-08-05,

There’s no easy way to enforce that each order ID is unique and corresponds to the correct customer, which could lead to errors or duplicated orders.

6. Complex File Formats

  • Description: Managing and processing flat files with repeating groups often requires complex file formats.
  • Example: Custom delimiters or nested formats might be needed to handle repeating groups, making the file harder to understand and work with.
  • Problem: This increases the likelihood of errors during data entry, processing, or when the file is read by different systems.

Eg: You are storing customer orders in a flat file where each customer can place multiple orders.

CustomerID, CustomerName, Order1ID, Order1Date, Order2ID, Order2Date, Order3ID, Order3Date
1001, Alice Brown, 5001, 2023-08-01, 5002, 2023-08-15, 
1002, Bob White, 5003, 2023-08-05, 

As the number of orders grows, the file format becomes increasingly complex, requiring custom scripts to manage and extract order data for each customer.

7. Lack of Referential Integrity

  • Description: Flat files lack mechanisms to enforce referential integrity between related groups of data.
  • Example: Ensuring that a skill listed in one file corresponds to a valid skill ID in another file requires manual checks or complex logic.
  • Problem: This can lead to orphaned records or mismatches between related data sets.

Eg: A fleet management company tracks maintenance records for each vehicle in a flat file. Each vehicle can have multiple maintenance records.

VehicleID, VehicleType, Maintenance1Date, Maintenance1Type, Maintenance2Date, Maintenance2Type
V001, Truck, 2023-01-15, Oil Change, 2023-03-10, Tire Rotation
V002, Van, 2023-02-20, Brake Inspection, , 

There’s no way to ensure that the Maintenance1Type and Maintenance2Type fields are valid maintenance types or that the dates are in correct chronological order.

8. Difficulty in Data Modification

  • Description: Modifying data in flat files with repeating groups can be complex and error-prone.
  • Example: Adding or removing an item from a repeating group might require extensive manual edits across multiple records.
  • Problem: This increases the risk of errors and makes data management time-consuming.

Eg: A university maintains a flat file to record student enrollments in courses. Each student can enroll in multiple courses.

StudentID, StudentName, Course1, Course2, Course3, Course4, Course5
2001, Charlie Green, Math101, Physics102, , , 
2002, Dana Blue, History101, Math101, Chemistry101, , 

If a student drops a course or switches to a different one, manually editing the file can easily lead to errors, especially as the number of students and courses increases.

After listing down all these, Saravanan started looking into solutions. His search goes on…

What is Relational Database and Postgres Sql ?

In the city of Data, the citizens relied heavily on organizing their information. The city was home to many different types of data numbers, names, addresses, and even some exotic types like images and documents. But as the city grew, so did the complexity of managing all this information.

One day, the city’s leaders called a meeting to discuss how best to handle the growing data. They were split between two different systems

  1. the old and trusted Relational Database Management System (RDBMS)
  2. the new, flashy NoSQL databases.

Enter Relational Databases:

Relational databases were like the city’s libraries. They had rows of neatly organized shelves (tables) where every book (data entry) was placed according to a specific category (columns).

Each book had a unique ID (primary key) so that anyone could find it quickly. These libraries had been around for decades, and everyone knew how to use them.

The RDBMS was more than just a library. It enforced rules (constraints) to ensure that no book went missing, was duplicated, or misplaced. It even allowed librarians (queries) to connect different books using relationships (joins).

If you wanted to find all the books by a particular author that were published in the last five years, the RDBMS could do it in a heartbeat.

The Benefits of RDBMS:

The citizens loved the RDBMS because it was:

  1. Organized: Everything was in its place, and data was easy to find.
  2. Reliable: The rules ensured data integrity, so they didn’t have to worry about inconsistencies.
  3. Powerful: It could handle complex queries, making it easy to get insights from their data.
  4. Secure: Access to the data could be controlled, keeping it safe from unauthorized users.

The Rise of NoSQL:

But then came the NoSQL databases, which were more like vast, sprawling warehouses. These warehouses didn’t care much about organization; they just stored everything in a big open space. You could toss in anything, and it would accept it—no need for strict categories or relationships. This flexibility appealed to the tech-savvy citizens who wanted to store newer, more diverse types of data like social media posts, images, and videos.

NoSQL warehouses were fast. They could handle enormous amounts of data without breaking a sweat and were perfect for real-time applications like chat systems and analytics.

The PostgreSQL Advantage:

PostgreSQL was a superstar in the world of RDBMS. It combined the organization and reliability of traditional relational databases with some of the flexibility of NoSQL. It allowed citizens to store structured data in tables while also offering support for unstructured data types like JSON. This made PostgreSQL a versatile choice, bridging the gap between the old and new worlds.

For installing postgres : https://www.postgresql.org/download/

The Dilemma: PostgreSQL vs. NoSQL:

The city faced a dilemma. Should they stick with PostgreSQL, which offered the best of both worlds, or fully embrace NoSQL for its speed and flexibility? The answer wasn’t simple. It depended on what the city valued more: the structured, reliable nature of PostgreSQL or the unstructured, flexible approach of NoSQL.

For applications that required strict data integrity and complex queries, PostgreSQL was the way to go. But for projects that needed to handle massive amounts of unstructured data quickly, NoSQL was the better choice.

Conclusion:

In the end, the city of Data realized that there was no one-size-fits-all solution. They decided to use PostgreSQL for applications where data relationships and integrity were crucial, and NoSQL for those that required speed and flexibility with diverse data types.

And so, the citizens of Data lived happily, managing their information with the right tools for the right tasks, knowing that both systems had their place in the ever-growing city.

Tool: Serial Activity – Remote SSH Manager

Why this tool was created ?

During our college times, we had a crash course on Machine Learning. Our coordinators has arranged an ML Engineer to take class for 3 days. He insisted to install packages to have hands-on experience. But unfortunately many of our people were not sure about the installations of the packages. So we need to find a solution to install all necessary packages in all machines.

We had a scenario like, all the machines had one specific same user account with same password for all the machines. So we were like; if we are able to automate it in one machine then it would be easy for rest of the machines ( Just a for-loop iterating the x.0.0.1 to x.0.0.255 ). This is the birthplace of this tool.

Code=-

#!/usr/bin/env python
import sys
import os.path
from multiprocessing.pool import ThreadPool

import paramiko

BASE_ADDRESS = "192.168.7."
USERNAME = "t1"
PASSWORD = "uni1"


def create_client(hostname):
    """Create a SSH connection to a given hostname."""
    ssh_client = paramiko.SSHClient()
    ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh_client.connect(hostname=hostname, username=USERNAME, password=PASSWORD)
    ssh_client.invoke_shell()
    return ssh_client


def kill_computer(ssh_client):
    """Power off a computer."""
    ssh_client.exec_command("poweroff")


def install_python_modules(ssh_client):
    """Install the programs specified in requirements.txt"""
    ftp_client = ssh_client.open_sftp()

    # Move over get-pip.py
    local_getpip = os.path.expanduser("~/lab_freak/get-pip.py")
    remote_getpip = "/home/%s/Documents/get-pip.py" % USERNAME
    ftp_client.put(local_getpip, remote_getpip)

    # Move over requirements.txt
    local_requirements = os.path.expanduser("~/lab_freak/requirements.txt")
    remote_requirements = "/home/%s/Documents/requirements.txt" % USERNAME
    ftp_client.put(local_requirements, remote_requirements)

    ftp_client.close()

    # Install pip and the desired modules.
    ssh_client.exec_command("python %s --user" % remote_getpip)
    ssh_client.exec_command("python -m pip install --user -r %s" % remote_requirements)


def worker(action, hostname):
    try:
        ssh_client = create_client(hostname)

        if action == "kill":
            kill_computer(ssh_client)
        elif action == "install":
            install_python_modules(ssh_client)
        else:
            raise ValueError("Unknown action %r" % action)
    except BaseException as e:
        print("Running the payload on %r failed with %r" % (hostname, action))


def main():
    if len(sys.argv) < 2:
        print("USAGE: python kill.py ACTION")
        sys.exit(1)

    hostnames = [str(BASE_ADDRESS) + str(i) for i in range(30, 60)]

    with ThreadPool() as pool:
        pool.map(lambda hostname: worker(sys.argv[1], hostname), hostnames)


if __name__ == "__main__":
    main()


Tasks – Docker

  1. Install Docker on your local machine. Verify the installation by running the hello-world container.
  2. Pull the nginx image from Docker Hub and run it as a container. Map port 80 of the container to port 8080 of your host.
  3. Create a Dockerfile for a simple Node.js application that serves “Hello World” on port 3000. Build the Docker image with tag my-node-app and run a container. Below is the sample index.js file.

const express = require('express');
const app = express();

app.get('/', (req, res) => {
    res.send('Hello World');
});

app.listen(3000, () => {
    console.log('Server is running on port 3000');
});

4. Tag the Docker image my-node-app from Task 3 with a version tag v1.0.0.

5. Push the tagged image from Task 4 to your Docker Hub repository.

6. Run a container from the ubuntu image and start an interactive shell session inside it. You can run commands like ls, pwd, etc.

7. Create a Dockerfile for a Go application that uses multi-stage builds to reduce the final image size. The application should print “Hello Docker”. Sample Go code.


package main

import "fmt"

func main() {
    fmt.Println("Hello Docker")
}

8. Create a Docker volume and use it to persist data for a MySQL container. Verify that the data persists even after the container is removed. Try creating a test db.

9. Create a custom Docker network and run two containers (e.g., nginx and mysql) on that network. Verify that they can communicate with each other.

10. Create a docker-compose.yml file to define and run a multi-container Docker application with nginx as a web server and mysql as a database.

11. Scale the nginx service in the previous Docker Compose setup to run 3 instances.

12. Create a bind mount to share data between your host system and a Docker container running nginx. Modify a file on your host and see the changes reflected in the container.

13. Add a health check to a Docker container running a simple Node.js application. The health check should verify that the application is running and accessible.

Sample Healthcheck API in node.js,


const express = require('express');
const app = express();

// A simple route to return the status of the application
app.get('/health', (req, res) => {
    res.status(200).send('OK');
});

// Example main route
app.get('/', (req, res) => {
    res.send('Hello, Docker!');
});

// Start the server on port 3000
const port = 3000;
app.listen(port, () => {
    console.log(`App is running on http://localhost:${port}`);
});

14. Modify a Dockerfile to take advantage of Docker’s build cache, ensuring that layers that don’t change are reused.

15. Run a PostgreSQL database in a Docker container and connect to it using a database client from your host.

16. Create a custom Docker network and run a Node.js application and a MongoDB container on the same network. The Node.js application should connect to MongoDB using the container name.


const mongoose = require('mongoose');

mongoose.connect('mongodb://mongodb:27017/mydatabase', {
  useNewUrlParser: true,
  useUnifiedTopology: true,
}).then(() => {
  console.log('Connected to MongoDB');
}).catch(err => {
  console.error('Connection error', err);
});

17. Create a docker-compose.yml file to set up a MEAN (MongoDB, Express.js, Angular, Node.js) stack with services for each component.

18. Use the docker stats command to monitor resource usage (CPU, memory, etc.) of running Docker containers.

docker run -d --name busybox1 busybox sleep 1000
docker run -d --name busybox2 busybox sleep 1000

19. Create a Dockerfile for a simple Python Flask application that serves “Hello World”.

20. Configure Nginx as a reverse proxy to forward requests to a Flask application running in a Docker container.

21. Use docker exec to run a command inside a running container.


docker run -d --name ubuntu-container ubuntu sleep infinity

22. Modify a Dockerfile to create and use a non-root user inside the container.

23. Use docker logs to monitor the output of a running container.

24. Use docker system prune to remove unused Docker data (e.g., stopped containers, unused networks).

25. Run a Docker container in detached mode and verify that it’s running in the background.

26. Configure a Docker container to use a different logging driver (e.g., json-file or syslog).

27. Use build arguments in a Dockerfile to customize the build process.


from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, Docker Build Arguments!'

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

28. Set CPU and memory limits for a Docker container (for busybox)

Docker Ep 12 – Cheatsheet

Here’s a Docker cheat sheet that covers the most commonly used Docker commands, organized by categories.

Docker Basics

  • docker --version: Show the Docker version installed on your system.
  • docker info: Display system-wide information, including Docker version, number of containers, and images.
  • docker help: Get help on Docker commands.

Docker Images

  • docker images: List all Docker images on your system.
  • docker pull <image>: Download an image from a Docker registry (e.g., Docker Hub).
  • docker build -t <image_name> .: Build an image from a Dockerfile in the current directory and tag it with a name.
  • docker tag <image_id> <new_image_name>: Tag an image with a new name.
  • docker rmi <image>: Remove one or more images.
  • docker history <image>: Show the history of an image (layers).

Docker Containers

  • docker ps: List all running containers.
  • docker ps -a: List all containers (running and stopped).
  • docker run <image>: Run a container from an image.
  • docker run -d <image>: Run a container in detached mode (in the background).
  • docker run -it <image>: Run a container in interactive mode with a terminal.
  • docker run -p <host_port>:<container_port> <image>: Map a port from the host to the container.
  • docker stop <container>: Stop a running container.
  • docker start <container>: Start a stopped container.
  • docker restart <container>: Restart a running container.
  • docker rm <container>: Remove a stopped container.
  • docker logs <container>: View the logs of a container.
  • docker exec -it <container> <command>: Execute a command inside a running container (e.g., bash to open a shell).

Docker Networks

  • docker network ls: List all Docker networks.
  • docker network create <network_name>: Create a new Docker network.
  • docker network inspect <network_name>: View detailed information about a network.
  • docker network connect <network_name> <container>: Connect a container to a network.
  • docker network disconnect <network_name> <container>: Disconnect a container from a network.
  • docker network rm <network_name>: Remove a Docker network.

Docker Volumes

  • docker volume ls: List all Docker volumes.
  • docker volume create <volume_name>: Create a new Docker volume.
  • docker volume inspect <volume_name>: View detailed information about a volume.
  • docker volume rm <volume_name>: Remove a Docker volume.
  • docker run -v <volume_name>:<container_path> <image>: Mount a volume inside a container.

Docker Compose

  • docker-compose up: Start the services defined in a docker-compose.yml file.
  • docker-compose down: Stop and remove containers, networks, volumes, and images created by docker-compose up.
  • docker-compose build: Build or rebuild services defined in a docker-compose.yml file.
  • docker-compose ps: List containers managed by Docker Compose.
  • docker-compose logs: View logs for services managed by Docker Compose.
  • docker-compose exec <service> <command>: Execute a command in a running service.

Dockerfile Directives

  • FROM: Specifies the base image.
  • WORKDIR: Sets the working directory inside the container.
  • COPY: Copies files from the host to the container.
  • RUN: Executes a command in the container.
  • CMD: Specifies the command to run when the container starts.
  • EXPOSE: Specifies the port on which the container will listen.
  • ENV: Sets environment variables.
  • ENTRYPOINT: Configures the container to run as an executable.

Docker Cleanup Commands

  • docker system prune: Remove unused data (stopped containers, unused networks, dangling images, etc.).
  • docker container prune: Remove all stopped containers.
  • docker image prune: Remove unused images.
  • docker volume prune: Remove all unused volumes.
  • docker network prune: Remove all unused networks.

Miscellaneous

  • docker inspect <container_or_image>: Return low-level information on Docker objects (containers, images, volumes, etc.).
  • docker stats: Display a live stream of container(s) resource usage statistics.
  • docker top <container>: Display the running processes of a container.
  • docker cp <container>:<path> <local_path>: Copy files from a container to the host or vice versa.

Docker EP 11 – Docker Networking & Docker Volumes

Alex is tasked with creating a new microservices-based web application for a growing e-commerce platform. The application needs to handle everything from user authentication to inventory management, and you decide to use Docker to containerize the different services.

Here are the services code with their dockerfile,

Auth Service (auth-service)

This service handles user authentication,


# auth-service.py
from flask import Flask, request, jsonify

app = Flask(__name__)

# Dummy user database
users = {
    "user1": "password1",
    "user2": "password2"
}

@app.route('/login', methods=['POST'])
def login():
    data = request.json
    username = data.get('username')
    password = data.get('password')
    
    if username in users and users[username] == password:
        return jsonify({"message": "Login successful"}), 200
    else:
        return jsonify({"message": "Invalid credentials"}), 401

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Dockerfile:

# Use the official Python image.
FROM python:3.9-slim

# Set the working directory in the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install flask

# Make port 5000 available to the world outside this container
EXPOSE 5000

# Define environment variable
ENV NAME auth-service

# Run app.py when the container launches
CMD ["python", "auth_service.py"]


Inventory Service (inventory-service)

This service manages inventory data, inventory_service.py

from flask import Flask, request, jsonify

app = Flask(__name__)

# Dummy inventory database
inventory = {
    "item1": {"name": "Item 1", "quantity": 10},
    "item2": {"name": "Item 2", "quantity": 5}
}

@app.route('/inventory', methods=['GET'])
def get_inventory():
    return jsonify(inventory), 200

@app.route('/inventory/<item_id>', methods=['POST'])
def update_inventory(item_id):
    data = request.json
    if item_id in inventory:
        inventory[item_id]["quantity"] = data.get("quantity")
        return jsonify({"message": "Inventory updated"}), 200
    else:
        return jsonify({"message": "Item not found"}), 404

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5001)


Dockerfile

# Use the official Python image.
FROM python:3.9-slim

# Set the working directory in the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install flask

# Make port 5001 available to the world outside this container
EXPOSE 5001

# Define environment variable
ENV NAME inventory-service

# Run inventory_service.py when the container launches
CMD ["python", "inventory_service.py"]

Dev Service (dev-service)

This service could be a simple service used during development for testing or managing files, dev-service.py


from flask import Flask, request, jsonify
import os

app = Flask(__name__)

@app.route('/files', methods=['GET'])
def list_files():
    files = os.listdir('/app/data')
    return jsonify(files), 200

@app.route('/files/<filename>', methods=['GET'])
def read_file(filename):
    try:
        with open(f'/app/data/{filename}', 'r') as file:
            content = file.read()
        return jsonify({"filename": filename, "content": content}), 200
    except FileNotFoundError:
        return jsonify({"message": "File not found"}), 404

@app.route('/files/<filename>', methods=['POST'])
def write_file(filename):
    data = request.json.get("content", "")
    with open(f'/app/data/{filename}', 'w') as file:
        file.write(data)
    return jsonify({"message": f"File {filename} written successfully"}), 200

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5002)

Dockerfile


# Use the official Python image.
FROM python:3.9-slim

# Set the working directory in the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install flask

# Make port 5002 available to the world outside this container
EXPOSE 5002

# Define environment variable
ENV NAME dev-service

# Run dev_service.py when the container launches
CMD ["python", "dev_service.py"]

Auth Service: http://localhost:5000/login (POST request with JSON {"username": "user1", "password": "password1"})

Inventory Service: http://localhost:5001/inventory (GET or POST request)

Dev Service:

  • List files: http://localhost:5002/files (GET request)
  • Read file: http://localhost:5002/files/<filename> (GET request)
  • Write file: http://localhost:5002/files/<filename> (POST request with JSON {"content": "Your content here"})

The Lonely Container

You start by creating a simple Flask application for user authentication. After writing the code, you decide to containerize it using Docker.

docker build -t auth-service .
docker run -d -p 5000:5000 --name auth-service auth-service

The service is up and running, and you can access it at http://localhost:5000. But there’s one problem—it’s lonely. Your auth-service is a lone container in the vast sea of Docker networking. If you want to add more services, they need a way to communicate with each other.

  1. docker build -t auth-service .
  • This command builds a Docker image from the Dockerfile in the current directory (.) and tags it as auth-service.

2. docker run -d -p 5000:5000 --name auth-service auth-service

  • -d: Runs the container in detached mode (in the background).
  • -p 5000:5000: Maps port 5000 on the host to port 5000 in the container, making the Flask app accessible at http://localhost:5000.
  • --name auth-service: Names the container auth-service.
  • auth-service: The name of the image to run.

The Bridge of Communication

You decide to expand the application by adding a new inventory service. But how will these two services talk to each other? Enter the bridge network a magical construct that allows containers to communicate within their own private world.

You create a user-defined bridge network to allow your containers to talk to each other by name rather than by IP address.

docker network create ecommerce-network
docker run -d --name auth-service --network ecommerce-network auth-service
docker run -d --name inventory-service --network ecommerce-network inventory-service

Now, your services are not lonely anymore. The auth-service can talk to the inventory-service simply by using its name, like calling a friend across the room. In your code, you can reference inventory-service by name to establish a connection.

docker network create ecommerce-network

  • Creates a user-defined bridge network called ecommerce-network. This network allows containers to communicate with each other using their container names as hostnames.

docker run -d --name auth-service --network ecommerce-network auth-service

  • Runs the auth-service container on the ecommerce-network. The container can now communicate with other containers on the same network using their names.

docker run -d --name inventory-service --network ecommerce-network inventory-service

  • Runs the inventory-service container on the ecommerce-network. The auth-service container can now communicate with the inventory-service using the name inventory-service.

The City of Services

As your project grows, you realize that your application will eventually need to scale. Some services will run on different servers, possibly in different data centers. How will they communicate? It’s time to build a city—a network that spans multiple hosts.

You decide to use Docker Swarm, a tool that lets you manage a cluster of Docker hosts. You create an overlay network, a mystical web that allows containers across different servers to communicate as if they were right next to each other.

docker network create -d overlay ecommerce-overlay
docker service create --name auth-service --network ecommerce-overlay auth-service
docker service create --name inventory-service --network ecommerce-overlay inventory-service

Now, no matter where your containers are running, they can still talk to each other. It’s like giving each container a magic phone that works anywhere in the world.

docker network create -d overlay ecommerce-overlay

  • Creates an overlay network named ecommerce-overlay. Overlay networks are used for multi-host communication, typically in a Docker Swarm or Kubernetes environment.

docker service create --name auth-service --network ecommerce-overlay auth-service

  • Deploys the auth-service as a service on the ecommerce-overlay network. Services are used in Docker Swarm to manage containers across multiple hosts.

docker service create --name inventory-service --network ecommerce-overlay inventory-service

  • Deploys the inventory-service as a service on the ecommerce-overlay network, allowing it to communicate with the auth-service even if they are running on different physical or virtual machines.

The Treasure Chest of Data

Your services are talking, but they need to remember things—like user data and inventory levels. Enter the Docker volumes, the treasure chests where your containers can store their precious data.

For your inventory-service, you create a volume to store all the inventory information,

docker volume create inventory-data
docker run -d --name inventory-service --network ecommerce-network -v inventory-data:/app/data inventory-service

Now, even if your inventory-service container is destroyed and replaced, the data remains safe in the inventory-data volume. It’s like having a secure vault where you keep all your valuables.

docker volume create inventory-data

  • Creates a named Docker volume called inventory-data. Named volumes persist data even if the container is removed, and they can be shared between containers.

docker run -d --name inventory-service --network ecommerce-network -v inventory-data:/app/data inventory-service

  • -v inventory-data:/app/data: Mounts the inventory-data volume to the /app/data directory inside the container. Any data written to /app/data inside the container is stored in the inventory-data volume.

The Hidden Pathway

Sometimes, you need to work directly with files on your host machine, like when debugging or developing. You create a bind mount, a secret pathway between your host and the container.

docker run -d --name dev-service --network ecommerce-network -v $(pwd)/data:/app/data dev-service

Now, as you make changes to files in your host’s data directory, those changes are instantly reflected in your container. It’s like having a secret door in your house that opens directly into your office at work.

-v $(pwd)/data:/app/data:

  • This creates a bind mount, where the data directory in the current working directory on the host ($(pwd)/data) is mounted to /app/data inside the container. Changes made to files in the data directory on the host are reflected inside the container and vice versa. This is particularly useful for development, as it allows you to edit files on your host and see the changes immediately inside the running container.

The Seamless City

As your application grows, Docker Compose comes into play. It’s like the city planner, helping you manage all the roads (networks) and buildings (containers) in your bustling metropolis. With a simple docker-compose.yml file, you define your entire application stack,

version: '3'
services:
  auth-service:
    image: auth-service
    networks:
      - ecommerce-network
  inventory-service:
    image: inventory-service
    networks:
      - ecommerce-network
    volumes:
      - inventory-data:/app/data

networks:
  ecommerce-network:

volumes:
  inventory-data:

  1. version: '3': Specifies the version of the Docker Compose file format.
  2. services:: Defines the services (containers) that make up your application.
  • auth-service:: Defines the auth-service container.
    • image: auth-service: Specifies the Docker image to use for this service.
    • networks:: Specifies the networks this service is connected to.
  • inventory-service:: Defines the inventory-service container.
    • volumes:: Specifies the volumes to mount. Here, the inventory-data volume is mounted to /app/data inside the container.

3. networks:: Defines the networks used by the services. ecommerce-network is the custom bridge network created for communication between the services.

4. volumes:: Defines the volumes used by the services. inventory-data is a named volume used by the inventory-service.

Now, you can start your entire city with a single command,

docker-compose up

Everything springs to life services find each other, data is stored securely, and your city of containers runs like a well-oiled machine.

❌