Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

Can UV Transform Python Scripts into Standalone Executables ?

17 February 2025 at 17:48

Managing dependencies for small Python scripts has always been a bit of a hassle.

Traditionally, we either install packages globally (not recommended) or create a virtual environment, activate it, and install dependencies manually.

But what if we could run Python scripts like standalone binaries ?

Introducing PEP 723 – Inline Script Metadata

PEP 723 (https://peps.python.org/pep-0723/) introduces a new way to specify dependencies directly within a script, making it easier to execute standalone scripts without dealing with external dependency files.

This is particularly useful for quick automation scripts or one-off tasks.

Consider a script that interacts with an API requiring a specific package,

# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "requests",
# ]
# ///

import requests
response = requests.get("https://api.example.com/data")
print(response.json())

Here, instead of manually creating a requirements.txt or setting up a virtual environment, the dependencies are defined inline. When using uv, it automatically installs the required packages and runs the script just like a binary.

Running the Script as a Third-Party Tool

With uv, executing the script feels like running a compiled binary,

$ uv run fetch-data.py
Reading inline script metadata from: fetch-data.py
Installed dependencies in milliseconds

ehind the scenes, uv creates an isolated environment, ensuring a clean dependency setup without affecting the global Python environment. This allows Python scripts to function as independent tools without any manual dependency management.

Why This Matters

This approach makes Python an even more attractive choice for quick automation tasks, replacing the need for complex setups. It allows scripts to be shared and executed effortlessly, much like compiled executables in other programming environments.

By leveraging uv, we can streamline our workflow and use Python scripts as powerful, self-contained tools without the usual dependency headaches.

Avoid Cache Pitfalls: Key Problems and Fixes

16 February 2025 at 09:22

Caching is an essential technique for improving application performance and reducing the load on databases. However, improper caching strategies can lead to serious issues.

I got inspired from ByteByteGo https://www.linkedin.com/posts/bytebytego_systemdesign-coding-interviewtips-activity-7296767687978827776-Dizz

In this blog, we will discuss four common cache problems: Thundering Herd Problem, Cache Penetration, Cache Breakdown, and Cache Crash, along with their causes, consequences, and solutions.

  1. Thundering Herd Problem
    1. What is it?
    2. Example Scenario
    3. Solutions
  2. Cache Penetration
    1. What is it?
    2. Example Scenario
    3. Solutions
  3. Cache Breakdown
    1. What is it?
    2. Example Scenario
    3. Solutions
  4. Cache Crash
    1. What is it?
    2. Example Scenario
    3. Solutions

Thundering Herd Problem

What is it?

The Thundering Herd Problem occurs when a large number of keys in the cache expire at the same time. When this happens, all requests bypass the cache and hit the database simultaneously, overwhelming it and causing performance degradation or even a system crash.

Example Scenario

Imagine an e-commerce website where product details are cached for 10 minutes. If all the products’ cache expires at the same time, thousands of users sending requests will cause an overwhelming load on the database.

Solutions

  1. Staggered Expiration: Instead of setting a fixed expiration time for all keys, introduce a random expiry variation.
  2. Allow Only Core Business Queries: Limit direct database access only to core business data, while returning stale data or temporary placeholders for less critical data.
  3. Lazy Rebuild Strategy: Instead of all requests querying the database, the first request fetches data and updates the cache while others wait.
  4. Batch Processing: Queue multiple requests and process them in batches to reduce database load.

Cache Penetration

What is it?

Cache Penetration occurs when requests are made for keys that neither exist in the cache nor in the database. Since these requests always hit the database, they put excessive pressure on the system.

Example Scenario

A malicious user could attempt to query random user IDs that do not exist, forcing the system to repeatedly query the database and skip the cache.

Solutions

  1. Cache Null Values: If a key does not exist in the database, store a null value in the cache to prevent unnecessary database queries.
  2. Use a Bloom Filter: A Bloom filter helps check whether a key exists before querying the database. If the Bloom filter does not contain the key, the request is discarded immediately.
  3. Rate Limiting: Implement request throttling to prevent excessive access to non-existent keys.
  4. Data Prefetching: Predict and load commonly accessed data into the cache before it is needed.

Cache Breakdown

What is it?

Cache Breakdown is similar to the Thundering Herd Problem, but it occurs specifically when a single hot key (a frequently accessed key) expires. This results in a surge of database queries as all users try to retrieve the same data.

Example Scenario

A social media platform caches trending hashtags. If the cache expires, millions of users will query the same hashtag at once, hitting the database hard.

Solutions

  1. Never Expire Hot Keys: Keep hot keys permanently in the cache unless an update is required.
  2. Preload the Cache: Refresh the cache asynchronously before expiration by setting a background task to update the cache regularly.
  3. Mutex Locking: Ensure only one request updates the cache, while others wait for the update to complete.
  4. Double Buffering: Maintain a secondary cache layer to serve requests while the primary cache is being refreshed.

Cache Crash

What is it?

A Cache Crash occurs when the cache service itself goes down. When this happens, all requests fall back to the database, overloading it and causing severe performance issues.

Example Scenario

If a Redis instance storing session data for a web application crashes, all authentication requests will be forced to hit the database, leading to a potential outage.

Solutions

  1. Cache Clustering: Use a cluster of cache nodes instead of a single instance to ensure high availability.
  2. Persistent Storage for Cache: Enable persistence modes like Redis RDB or AOF to recover data quickly after a crash.
  3. Automatic Failover: Configure automated failover with tools like Redis Sentinel to ensure availability even if a node fails.
  4. Circuit Breaker Mechanism: Prevent the application from directly accessing the database if the cache is unavailable, reducing the impact of a crash.
class CircuitBreaker:
    def __init__(self, failure_threshold=5):
        self.failure_count = 0
        self.failure_threshold = failure_threshold
    
    def call(self, func, *args, **kwargs):
        if self.failure_count >= self.failure_threshold:
            return "Service unavailable"
        try:
            return func(*args, **kwargs)
        except Exception:
            self.failure_count += 1
            return "Error"

Caching is a powerful mechanism to improve application performance, but improper strategies can lead to severe bottlenecks. Problems like Thundering Herd, Cache Penetration, Cache Breakdown, and Cache Crash can significantly degrade system reliability if not handled properly.

Learning Notes #53 – The Expiration Time Can Be Unexpectedly Lost While Using Redis SET EX

12 January 2025 at 09:14

Redis, a high-performance in-memory key-value store, is widely used for caching, session management, and various other scenarios where fast data retrieval is essential. One of its key features is the ability to set expiration times for keys. However, when using the SET command with the EX option, developers might encounter unexpected behaviors where the expiration time is seemingly lost. Let’s explore this issue in detail.

Understanding SET with EX

The Redis SET command with the EX option allows you to set a key’s value and specify its expiration time in seconds. For instance


SET key value EX 60

This command sets the key key to the value value and sets an expiration time of 60 seconds.

The Problem

In certain cases, the expiration time might be unexpectedly lost. This typically happens when subsequent operations overwrite the key without specifying a new expiration. For example,


SET key value1 EX 60
SET key value2

In the above sequence,

  1. The first SET command assigns a value to key and sets an expiration of 60 seconds.
  2. The second SET command overwrites the value of key but does not include an expiration time, resulting in the key persisting indefinitely.

This behavior can lead to subtle bugs, especially in applications that rely on key expiration for correctness or resource management.

Why Does This Happen?

The Redis SET command is designed to replace the entire state of a key, including its expiration. When you use SET without the EX, PX, or EXAT options, the expiration is removed, and the key becomes persistent. This behavior aligns with the principle that SET is a complete update operation.

When using Redis SET with EX, be mindful of operations that might overwrite keys without reapplying expiration. Understanding Redis’s behavior and implementing robust patterns can save you from unexpected issues, ensuring your application remains efficient and reliable.

Learning Notes #18 – Bulk Head Pattern (Resource Isolation) | Cloud Pattern

30 December 2024 at 17:48

Today, i learned about bulk head pattern and how it makes the system resilient to failure, resource exhaustion. In this blog i jot down notes on this pattern for better understanding.

In today’s world of distributed systems and microservices, resiliency is key to ensuring applications are robust and can withstand failures.

The Bulkhead Pattern is a design principle used to improve system resilience by isolating different parts of a system to prevent failure in one component from cascading to others.

What is the Bulkhead Pattern?

The term “bulkhead” originates from shipbuilding, where bulkheads are partitions that divide a ship into separate compartments. If one compartment is breached, the others remain intact, preventing the entire ship from sinking. Similarly, in software design, the Bulkhead Pattern isolates components or services so that a failure in one part does not bring down the entire system.

In software systems, bulkheads:

  • Isolate resources (e.g., threads, database connections, or network calls) for different components.
  • Limit the scope of failures.
  • Allow other parts of the system to continue functioning even if one part is degraded or completely unavailable.

Example

Consider an e-commerce application with a product-service that has two endpoints

  1. /product/{id} – This endpoint gives detailed information about a specific product, including ratings and reviews. It depends on the rating-service.
  2. /products – This endpoint provides a catalog of products based on search criteria. It does not depend on any external services.

Consider, with a fixed amount of resource allocated to product-service is loaded with /product/{id} calls, then they can monopolize the thread pool. This delays /products requests, causing users to experience slowness even though these requests are independent. Which leads to resource exhaustion and failures.

With bulkhead pattern, we can allocate separate client, connection pools to isolate the service interaction. we can implement bulkhead by allocating some connection pool (10) to /product/{id} requests and /products requests have a different connection pool (5) .

Even if /product/{id} requests are slow or encounter high traffic, /products requests remain unaffected.

Scenarios Where the Bulkhead Pattern is Needed

  1. Microservices with Shared Resources – In a microservices architecture, multiple services might share limited resources such as database connections or threads. If one service experiences a surge in traffic or a failure, it can exhaust these shared resources, impacting all other services. Bulkheading ensures each service gets a dedicated pool of resources, isolating the impact of failures.
  2. Prioritizing Critical Workloads – In systems with mixed workloads (e.g., processing user transactions and generating reports), critical operations like transaction processing must not be delayed or blocked by less critical tasks. Bulkheading allocates separate resources to ensure critical tasks have priority.
  3. Third-Party API Integration – When an application depends on multiple external APIs, one slow or failing API can delay the entire application if not isolated. Using bulkheads ensures that issues with one API do not affect interactions with others.
  4. Multi-Tenant Systems – In SaaS applications serving multiple tenants, a single tenant’s high resource consumption or failure should not degrade the experience for others. Bulkheads can segregate resources per tenant to maintain service quality.
  5. Cloud-Native Applications – In cloud environments, services often scale independently. A spike in one service’s load should not overwhelm shared backend systems. Bulkheads help isolate and manage these spikes.
  6. Event-Driven Systems – In event-driven architectures with message queues, processing backlogs for one type of event can delay others. By applying the Bulkhead Pattern, separate processing pipelines can handle different event types independently.

What are the Key Points of the Bulkhead Pattern? (Simplified)

  • Define Partitions – (Think of a ship) it’s divided into compartments (partitions) to keep water from flooding the whole ship if one section gets damaged. In software, these partitions are designed around how the application works and its technical needs.
  • Designing with Context – If you’re using a design approach like DDD (Domain-Driven Design), make sure your bulkheads (partitions) match the business logic boundaries.
  • Choosing Isolation Levels – Decide how much isolation is needed. For example: Threads for lightweight tasks. Separate containers or virtual machines for more critical separations. Balance between keeping things separate and the costs or extra effort involved.
  • Combining Other Techniques – Bulkheads work even better with patterns like Retry, Circuit Breaker, Throttling.
  • Monitoring – Keep an eye on each partition’s performance. If one starts getting overloaded, you can adjust resources or change limits.

When Should You Use the Bulkhead Pattern?

  • To Isolate Critical Resources – If one part of your system fails, other parts can keep working. For example, you don’t want search functionality to stop working because the reviews section is down.
  • To Prioritize Important Work – For example, make sure payment processing (critical) is separate from background tasks like sending emails.
  • To Avoid Cascading Failures – If one part of the system gets overwhelmed, it won’t drag down everything else.

When Should You Avoid It?

  • Complexity Isn’t Needed – If your system is simple, adding bulkheads might just make it harder to manage.
  • Resource Efficiency is Critical – Sometimes, splitting resources into separate pools can mean less efficient use of those resources. If every thread, connection, or container is underutilized, this might not be the best approach.

Challenges and Best Practices

  1. Overhead: Maintaining separate resource pools can increase system complexity and resource utilization.
  2. Resource Sizing: Properly sizing the pools is critical to ensure resources are efficiently utilized without bottlenecks.
  3. Monitoring: Use tools to monitor the health and performance of each resource pool to detect bottlenecks or saturation.

References:

  1. AWS https://aws.amazon.com/blogs/containers/building-a-fault-tolerant-architecture-with-a-bulkhead-pattern-on-aws-app-mesh/
  2. Resilience https://resilience4j.readme.io/docs/bulkhead
  3. https://medium.com/nerd-for-tech/bulkhead-pattern-distributed-design-pattern-c673d5e81523
  4. Microsoft https://learn.microsoft.com/en-us/azure/architecture/patterns/bulkhead

Learning Notes #11 – Sidecar Pattern | Cloud Patterns

26 December 2024 at 17:40

Today, I learnt about Sidecar Pattern. Its seems like offloading the common functionalities (logging, networking, …) aside within a pod to be used by other apps within the pod.

Its just not only about pods, but other deployments aswell. In this blog, i am going to curate the items i have learnt for my future self. Its a pattern, not an strict rule.

What is a Sidecar?

Imagine you’re riding a motorbike, and you attach a little sidecar to carry your friend or groceries. The sidecar isn’t part of the motorbike’s engine or core mechanism, but it helps you achieve your goals—whether it’s carrying more stuff or having a buddy ride along.

In the software world, a sidecar is a similar concept. It’s a separate process or container that runs alongside a primary application. Like the motorbike’s sidecar, it supports the main application by offloading or enhancing certain tasks without interfering with its core functionality.

Why Use a Sidecar?

In traditional applications, all responsibilities (logging, communication, monitoring, etc.) are bundled into the main application. This approach can make the application complex and harder to manage. Sidecars address this by handling auxiliary tasks separately, so the main application can focus on its primary purpose.

Here are some key reasons to use a sidecar

  1. Modularity: Sidecars separate responsibilities, making the system easier to develop, test, and maintain.
  2. Reusability: The same sidecar can be used across multiple services. And its language agnostic.
  3. Scalability: You can scale the sidecar independently from the main application.
  4. Isolation: Sidecars provide a level of isolation, reducing the risk of one part affecting the other.

Real-Life Analogies

To make the concept clearer, here are some real-world analogies:

  1. Coffee Maker with a Milk Frother:
    • The coffee maker (main application) brews coffee.
    • The milk frother (sidecar) prepares frothed milk for your latte.
    • Both work independently but combine their outputs for a better experience.
  2. Movie Subtitles:
    • The movie (main application) provides the visuals and sound.
    • The subtitles (sidecar) add clarity for those who need them.
    • You can watch the movie with or without subtitles—they’re optional but enhance the experience.
  3. A School with a Sports Coach:
    • The school (main application) handles education.
    • The sports coach (sidecar) focuses on physical training.
    • Both have distinct roles but contribute to the overall development of students.

Some Random Sidecar Ideas in Software

Let’s look at how sidecars are used in actual software scenarios

  1. Service Meshes (e.g., Istio, Linkerd):
    • A service mesh helps microservices communicate with each other reliably and securely.
    • The sidecar (proxy like Envoy) handles tasks like load balancing, encryption, and monitoring, so the main application doesn’t have to.
  2. Logging and Monitoring:
    • Instead of the main application generating and managing logs, a sidecar can collect, format, and send logs to a centralized system like Elasticsearch or Splunk.
  3. Authentication and Security:
    • A sidecar can act as a gatekeeper, handling user authentication and ensuring that only authorized requests reach the main application.
  4. Data Caching:
    • If an application frequently queries a database, a sidecar can serve as a local cache, reducing database load and speeding up responses.
  5. Service Discovery:
    • Sidecars can aid in service discovery by automatically registering the main application with a registry service or load balancer, ensuring seamless communication in dynamic environments.

How Sidecars Work

In modern environments like Kubernetes, sidecars are often deployed as separate containers within the same pod as the main application. They share the same network and storage, making communication between the two seamless.

Here’s a simplified workflow

  1. The main application focuses on its core tasks (e.g., serving a web page).
  2. The sidecar handles auxiliary tasks (e.g., compressing and encrypting logs).
  3. The two communicate over local connections within the pod.

Pros and Cons of Sidecars

Pros:

  • Simplifies the main application.
  • Encourages reusability and modular design.
  • Improves scalability and flexibility.
  • Enhances observability with centralized logging and metrics.
  • Facilitates experimentation—you can deploy or update sidecars independently.

Cons:

  • Adds complexity to deployment and orchestration.
  • Consumes additional resources (CPU, memory).
  • Requires careful design to avoid tight coupling between the sidecar and the main application.
  • Latency (You are adding an another hop).

Do we always need to use sidecars

No. Not at all.

a. When there is a latency between the parent application and sidecar, then Reconsider.

b. If your application is small, then reconsider.

c. When you are scaling differently or independently from the parent application, then Reconsider.

Some other examples

1. Adding HTTPS to a Legacy Application

Consider a legacy web service which services requests over unencrypted HTTP. We have a requirement to enhance the same legacy system to service requests with HTTPS in future.

The legacy app is configured to serve request exclusively on localhost, which means that only services that share the local network with the server able to access legacy application. In addition to the main container (legacy app) we can add Nginx Sidecar container which runs in the same network namespace as the main container so that it can access the service running on localhost.

2. For Logging (Image from ByteByteGo)

Sidecars are not just technical solutions; they embody the principle of collaboration and specialization. By dividing responsibilities, they empower the main application to shine while ensuring auxiliary tasks are handled efficiently. Next time you hear about sidecars, you’ll know they’re more than just cool attachments for motorcycle they’re an essential part of scalable, maintainable software systems.

Also, do you feel its closely related to Adapter and Ambassador Pattern ? I Do.

References:

  1. Hussein Nasser – https://www.youtube.com/watch?v=zcJWvhzkPsw&pp=ygUHc2lkZWNhcg%3D%3D
  2. Sudo Code – https://www.youtube.com/watch?v=QU5WcwuFpZU&pp=ygUPc2lkZWNhciBwYXR0ZXJu
  3. Software Dude – https://www.youtube.com/watch?v=poPUzN33Oug&pp=ygUPc2lkZWNhciBwYXR0ZXJu
  4. https://medium.com/nerd-for-tech/microservice-design-pattern-sidecar-sidekick-pattern-dbcea9bed783
  5. https://dzone.com/articles/sidecar-design-pattern-in-your-microservices-ecosy-1

Locust EP 1 : Load Testing: Ensuring Application Reliability with Real-Time Examples and Metrics

14 November 2024 at 15:48

In today’s fast-paced digital application, delivering a reliable and scalable application is key to providing a positive user experience.

One of the most effective ways to guarantee this is through load testing. This post will walk you through the fundamentals of load testing, real-time examples of its application, and crucial metrics to watch for.

What is Load Testing?

Load testing is a type of performance testing that simulates real-world usage of an application. By applying load to a system, testers observe how it behaves under peak and normal conditions. The primary goal is to identify any performance bottlenecks, ensure the system can handle expected user traffic, and maintain optimal performance.

Load testing answers these critical questions:

  • Can the application handle the expected user load?
  • How does performance degrade as the load increases?
  • What is the system’s breaking point?

Why is Load Testing Important?

Without load testing, applications are vulnerable to crashes, slow response times, and unavailability, all of which can lead to a poor user experience, lost revenue, and brand damage. Proactive load testing allows teams to address issues before they impact end-users.

Real-Time Load Testing Examples

Let’s explore some real-world examples that demonstrate the importance of load testing.

Example 1: E-commerce Website During a Sale Event

An online retailer preparing for a Black Friday sale knows that traffic will spike. They conduct load testing to simulate thousands of users browsing, adding items to their cart, and checking out simultaneously. By analyzing the system’s response under these conditions, the retailer can identify weak points in the checkout process or database and make necessary optimizations.

Example 2: Video Streaming Platform Launch

A new streaming platform is preparing for launch, expecting millions of users. Through load testing, the team simulates high traffic, testing how well video streaming performs under maximum user load. This testing also helps check if CDN (Content Delivery Network) configurations are optimized for global access, ensuring minimal buffering and downtime during peak hours.

Example 3: Financial Services Platform During Market Hours

A trading platform experiences intense usage during market open and close hours. Load testing helps simulate these peak times, ensuring that real-time data updates, transactions, and account management work flawlessly. Testing for these scenarios helps avoid issues like slow trade executions and platform unavailability during critical trading periods.

Key Metrics to Monitor in Load Testing

Understanding key metrics is essential for interpreting load test results. Here are some critical metrics to focus on:

1. Response Time

  • Definition: The time taken by the system to respond to a request.
  • Why It Matters: Slow response times can frustrate users and indicate bottlenecks.
  • Example Thresholds: For websites, a response time below 2 seconds is considered acceptable.

2. Throughput

  • Definition: The number of requests processed per second.
  • Why It Matters: Throughput indicates how many concurrent users your application can handle.
  • Real-Time Use Case: In our e-commerce example, the retailer would track throughput to ensure the checkout process doesn’t become a bottleneck.

3. Error Rate

  • Definition: The percentage of failed requests out of total requests.
  • Why It Matters: A high error rate could indicate application instability under load.
  • Real-Time Use Case: The trading platform monitors the error rate during market close, ensuring the system doesn’t throw errors under peak trading load.

4. CPU and Memory Utilization

  • Definition: The percentage of CPU and memory resources used during the load test.
  • Why It Matters: High CPU or memory utilization can signal that the server may not handle additional load.
  • Real-Time Use Case: The video streaming platform tracks memory usage to prevent lag or interruptions in streaming as users increase.

5. Concurrent Users

  • Definition: The number of users active on the application at the same time.
  • Why It Matters: Concurrent users help you understand how much load the system can handle before performance starts degrading.
  • Real-Time Use Case: The retailer tests how many concurrent users can shop simultaneously without crashing the website.

6. Latency

  • Definition: The time it takes for a request to travel from the client to the server and back.
  • Why It Matters: High latency indicates network or processing delays that can slow down the user experience.
  • Real-Time Use Case: For a financial app, reducing latency ensures trades execute in near real-time, which is crucial for users during volatile market conditions.

7. 95th and 99th Percentile Response Times

  • Definition: The time within which 95% or 99% of requests are completed.
  • Why It Matters: These percentiles help identify outliers that may impact user experience.
  • Real-Time Use Case: The streaming service may analyze these percentiles to ensure smooth playback for most users, even under peak loads.

Best Practices for Effective Load Testing

  1. Set Clear Objectives: Define specific goals, such as the expected number of concurrent users or acceptable response times, based on the nature of the application.
  2. Use Realistic Load Scenarios: Create scenarios that mimic actual user behavior, including peak times, user interactions, and geographical diversity.
  3. Analyze Bottlenecks and Optimize: Use test results to identify and address performance bottlenecks, whether in the application code, database queries, or server configurations.
  4. Monitor in Real-Time: Track metrics like response time, throughput, and error rates in real-time to identify issues as they arise during the test.
  5. Repeat and Compare: Conduct multiple load tests to ensure consistent performance over time, especially after any significant update or release.

Load testing is crucial for building a resilient and scalable application. By using real-world scenarios and keeping a close eye on metrics like response time, throughput, and error rates, you can ensure your system performs well under load. Proactive load testing helps to deliver a smooth, reliable experience for users, even during peak times.

❌
❌