Learning Notes #72 – Metrics in K6 Load Testing

In our previous blog on k6, we ran a script.js to test an API and received a set of metrics as CLI output.

In this blog we are going to dig deeper into understanding those metrics in k6.

1. HTTP Request Metrics

http_reqs

  • Description: Total number of HTTP requests initiated during the test.
  • Usage: Indicates the volume of traffic generated. A high number of requests can simulate real-world usage patterns.

http_req_duration

  • Description: Total time for the request, from when it is sent until the full response is received (in milliseconds). In k6 this is the sum of http_req_sending, http_req_waiting, and http_req_receiving.
  • Related timing metrics:
    • http_req_connecting: Time spent establishing a TCP connection.
    • http_req_tls_handshaking: Time taken to complete the TLS handshake.
    • http_req_waiting (TTFB): Time spent waiting for the first byte from the server.
    • http_req_sending: Time taken to send the HTTP request.
    • http_req_receiving: Time spent receiving the response data.
  • Usage: Identifies performance bottlenecks like slow server responses or network latency.

http_req_failed

  • Description: Proportion of failed HTTP requests (ratio between 0 and 1).
  • Usage: Highlights reliability issues. A high failure rate indicates problems with server stability or network errors.

2. VU (Virtual User) Metrics

vus

  • Description: Number of active Virtual Users at any given time.
  • Usage: Reflects concurrency level. Helps analyze how the system performs under varying loads.

vus_max

  • Description: Maximum number of Virtual Users during the test.
  • Usage: Defines the peak load. Useful for stress testing and capacity planning.
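
You can shape this peak with the stages option instead of a fixed VU count. A minimal sketch, with illustrative durations and targets:

import http from 'k6/http';

export const options = {
  stages: [
    { duration: '30s', target: 20 }, // ramp up to 20 VUs
    { duration: '1m', target: 20 },  // hold the peak
    { duration: '30s', target: 0 },  // ramp down
  ],
};

export default function () {
  http.get('https://test.k6.io'); // example endpoint
}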

3. Iteration Metrics

iterations

  • Description: Total number of script iterations executed.
  • Usage: Measures the test’s progress and workload. Useful in endurance (soak) testing to observe long-term stability.

iteration_duration

  • Description: Time taken to complete one iteration of the script.
  • Usage: Helps identify performance degradation over time, especially under sustained load.

4. Data Transfer Metrics

data_sent

  • Description: Total amount of data sent over the network (in bytes).
  • Usage: Monitors network usage. High data volumes might indicate inefficient request payloads.

data_received

  • Description: Total data received from the server (in bytes).
  • Usage: Detects bandwidth usage and helps identify heavy response payloads.

5. Custom Metrics (Optional)

While k6 provides built-in metrics, you can define custom metrics like Counters, Gauges, Rates, and Trends for specific business logic or technical KPIs.

Example

import { Counter } from 'k6/metrics';

let myCounter = new Counter('my_custom_metric');

export default function () {
  myCounter.add(1); // Increment the custom metric
}
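
Trends work the same way and are handy for timing specific steps. A minimal sketch that records each request's duration under a custom name (the endpoint is just an example):

import http from 'k6/http';
import { Trend } from 'k6/metrics';

// A Trend keeps a series of values and reports min/max/avg/percentiles
const checkoutDuration = new Trend('checkout_duration');

export default function () {
  const res = http.get('https://test.k6.io'); // example endpoint
  checkoutDuration.add(res.timings.duration); // record this sample
}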

Interpreting Metrics for Performance Optimization

  • Low http_req_duration + High http_reqs = Good scalability.
  • High http_req_failed = Investigate server errors or timeouts.
  • High data_sent / data_received = Optimize payloads.
  • Increasing iteration_duration over time = Possible memory leaks or resource exhaustion.
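
These rules of thumb can be turned into automatic pass/fail criteria with k6 thresholds. A minimal sketch, with illustrative limits that you should tune to your own SLOs:

import http from 'k6/http';

export const options = {
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% of requests under 500ms (illustrative)
    http_req_failed: ['rate<0.01'],   // fewer than 1% of requests may fail
  },
};

export default function () {
  http.get('https://test.k6.io'); // example endpoint
}

If a threshold is crossed, k6 marks the run as failed, which makes these checks useful in CI pipelines.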

Learning Notes #69 – Getting Started with k6: Writing Your First Load Test

Performance testing is a crucial part of ensuring the stability and scalability of web applications. k6 is a modern, open-source load testing tool that allows developers and testers to script and execute performance tests efficiently. In this blog, we’ll explore the basics of k6 and write a simple test script to get started.

What is k6?

k6 is a load testing tool designed for developers. It is written in Go but uses JavaScript for scripting tests. Key features include,

  • High performance with minimal resource consumption
  • JavaScript-based scripting
  • CLI-based execution with detailed reporting
  • Integration with monitoring tools like Grafana and Prometheus

Installation

For installation instructions, check: https://grafana.com/docs/k6/latest/set-up/install-k6/

Writing a Basic k6 Test

A k6 test is written in JavaScript. Here’s a simple script to test an API endpoint,


import http from 'k6/http';
import { check, sleep } from 'k6';

export let options = {
  vus: 10, // Number of virtual users
  duration: '10s', // Test duration
};

export default function () {
  let res = http.get('https://api.restful-api.dev/objects');
  check(res, {
    'is status 200': (r) => r.status === 200,
  });
  sleep(1); // Simulate user wait time
}

Running the Test

Save the script as script.js and execute the test using the following command,

k6 run script.js

Understanding the Output

After running the test, k6 will provide a summary including:

1. HTTP requests: Total number of requests made during the test.

2. Response time metrics:

  • min: The shortest response time recorded.
  • max: The longest response time recorded.
  • avg: The average response time of all requests.
  • p(90), p(95), p(99): Percentile values indicating response time distribution.

3. Checks: Number of checks passed or failed, such as status code validation.

4. Virtual users (VUs):

  • vus_max: The maximum number of virtual users active at any time.
  • vus: The current number of active virtual users.

5. Request Rate (RPS – Requests Per Second): The number of requests handled per second.

6. Failures: Number of errors or failed requests due to timeouts or HTTP status codes other than the expected ones.

    Next Steps

    Once you’ve successfully run your first k6 test, you can explore,

    • Load testing different APIs and endpoints
    • Running distributed tests
    • Exporting results to Grafana
    • Integrating k6 with CI/CD pipelines

    k6 is a powerful tool that helps developers and QA engineers ensure their applications perform under load. Stay tuned for more in-depth tutorials on advanced k6 features!

    Learning Notes #65 – Application Logs, Metrics, MDC

I am a big fan of logs. I would like to log everything: every request and response of an API. But is that correct? Though logs helped our team greatly during this new year, I wanted to know whether there is a better approach to logging things. That search made this blog, in which I jot down my notes on logging. Let's log it.

Throughout this blog, I try to generalize things and stay unbiased towards any particular language, though here and there you can see me leaning towards Python. Also, this is my opinion, not a hard rule.

Which is the best logger?

I'm not here to argue about which logger is the best; they all have their problems. But the worst one is usually the one you build yourself. Sure, existing loggers aren't perfect, but trying to create your own is often a much bigger mistake.

    1. Why Logging Matters

    Logging provides visibility into your application’s behavior, helping to,

• Diagnose and troubleshoot issues (this is the most common use case)
• Monitor application health and performance (metrics)
• Meet compliance and auditing requirements (audit logs)
• Enable debugging in production environments (we all do this.)

    However, poorly designed logging strategies can lead to excessive log volumes, higher costs, and difficulty in pinpointing actionable insights.

    2. Logging Best Practices

    a. Use Structured Logs

    Long story short, instead of unstructured plain text, use JSON or other structured formats. This makes parsing and querying easier, especially in log aggregation tools.

    
    {
      "timestamp": "2025-01-20T12:34:56Z",
      "level": "INFO",
      "message": "User login successful",
      "userId": 12345,
      "sessionId": "abcde12345"
    }
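
In Python, one way to produce such output is a small JSON formatter on top of the standard logging module. A minimal sketch; the field names are illustrative, and in practice you might reach for a library like python-json-logger:

import json
import logging

class JsonFormatter(logging.Formatter):
    # Render each log record as a single JSON line
    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Carry over context fields passed via `extra=...`
        for key in ("userId", "sessionId"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("User login successful", extra={"userId": 12345, "sessionId": "abcde12345"})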
    

    b. Leverage Logging Levels

    Define and adhere to appropriate logging levels to avoid log bloat:

    • DEBUG: Detailed information for debugging.
    • INFO: General operational messages.
    • WARNING: Indications of potential issues.
    • ERROR: Application errors that require immediate attention.
    • CRITICAL: Severe errors leading to application failure.
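
In Python, for instance, the configured level acts as a floor; a rough sketch:

import logging

logging.basicConfig(level=logging.WARNING)  # DEBUG and INFO are filtered out

logging.debug("verbose detail")      # suppressed
logging.info("operational message")  # suppressed
logging.warning("potential issue")   # emitted
logging.error("needs attention")     # emitted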

    c. Avoid Sensitive Data

Sanitize your logs to exclude sensitive information like passwords, PII, or API keys; instead, mask or hash such data. Don't add tokens, even for testing.
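
One way to enforce this in Python is a logging filter that masks sensitive fields before they reach any handler. A rough sketch, assuming key=value style messages (the field names are illustrative):

import logging
import re

SENSITIVE = re.compile(r"(password|token|api_key)=\S+", re.IGNORECASE)

class RedactingFilter(logging.Filter):
    def filter(self, record):
        message = record.getMessage()                 # resolve %-style args first
        record.msg = SENSITIVE.sub(r"\1=****", message)
        record.args = ()                              # args are already folded into msg
        return True

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("app")
logger.addFilter(RedactingFilter())

logger.info("login attempt password=hunter2")  # logged as: login attempt password=****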


    d. Include Contextual Information

    Incorporate metadata like request IDs, user IDs, or transaction IDs to trace specific events effectively.


    3. Log Ingestion at Scale

    As applications scale, log ingestion can become a bottleneck. Here’s how to manage it,

    a. Centralized Logging

    Stream logs to centralized systems like Elasticsearch, Logstash, Kibana (ELK), or cloud-native services like AWS CloudWatch, Azure Monitor, or Google Cloud Logging.

    b. Optimize Log Volume

    • Log only necessary information.
    • Use log sampling to reduce verbosity in high-throughput systems.
    • Rotate logs to limit disk usage.

    c. Use Asynchronous Logging

Asynchronous loggers improve application performance by delegating logging tasks to separate threads or processes. (Not suitable all the time; it has its own problems, such as buffered records being lost if the process crashes before the queue is drained.)
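
In Python, the standard library supports this pattern out of the box with QueueHandler and QueueListener; a minimal sketch:

import logging
import queue
from logging.handlers import QueueHandler, QueueListener

log_queue = queue.Queue(-1)  # unbounded queue between the app and the log writer

# The application logs into the queue, which is cheap and non-blocking
logger = logging.getLogger("app")
logger.setLevel(logging.INFO)
logger.addHandler(QueueHandler(log_queue))

# A background listener thread drains the queue into the real handler
listener = QueueListener(log_queue, logging.StreamHandler())
listener.start()

logger.info("handled request in 120ms")
listener.stop()  # flush remaining records on shutdown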

    d. Method return values are usually important

If you have a log in a method and don't include the method's return value, you're missing important information. Make an effort to include it, even at the expense of slightly less elegant-looking code.

    e. Include filename in error messages

    Mention the path/to/file:line-number to pinpoint the location of the issue.
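
In Python's logging this comes almost for free from the formatter; a rough sketch:

import logging

logging.basicConfig(
    level=logging.ERROR,
    # %(pathname)s and %(lineno)d pinpoint where the log call was made
    format="%(asctime)s %(levelname)s %(pathname)s:%(lineno)d %(message)s",
)

logging.error("Failed to parse config")  # the emitted line includes path/to/file:line-number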

4. Logging Don'ts

    a. Don’t Log Everything at the Same Level

    Logging all messages at the INFO or DEBUG level creates noise and makes it difficult to identify critical issues.

    b. Don’t Hardcode Log Messages

    Avoid static, vague, or generic log messages. Use dynamic and descriptive messages that include relevant context.

    # Bad Example
    Error occurred.
    
    # Good Example
    Error occurred while processing payment for user_id=12345, transaction_id=abc-6789.
    

    c. Don’t Log Sensitive or Regulated Data

    Exposing personally identifiable information (PII), passwords, or other sensitive data in logs can lead to compliance violations (e.g., GDPR, HIPAA).

    d. Don’t Ignore Log Rotation

    Failing to implement log rotation can result in disk space exhaustion, especially in high traffic systems (Log Retention).

    e. Don’t Overlook Log Correlation

    Logs without request IDs, session IDs, or contextual metadata make it difficult to correlate related events.

    f. Don’t Forget to Monitor Log Costs

    Logging everything without considering storage and processing costs can lead to financial inefficiency in large-scale systems.

    g. Keep the log message short

    Long and verbose messages are a cost. The cost is in reading time and ingestion time.

h. Never log inside a loop

This might seem obvious, but just to be clear: logging inside a loop, even if the log level isn't visible by default, can still hurt performance. It's best to avoid this whenever possible.

    If you absolutely need to log something at a hidden level and decide to break this guideline, keep it short and straightforward.

i. Log items you already "have"

We should avoid this,


logger.info("Reached X and value of method is %s", method())


Here, just for logging purposes, we are calling method() again. Even if the method is cheap, you're effectively running it regardless of the logging level, because the argument is evaluated before the logger decides whether to emit!
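
If the value is already computed for the business logic, log that variable instead of calling the method again; a rough sketch (method() is a hypothetical stand-in):

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def method():
    return 42  # hypothetical stand-in for a real computation

result = method()  # computed once, for the business logic itself
logger.info("Reached X and value of method is %s", result)  # reuse the value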

j. Don't log iterables

Even if it's a small list, the concern is that the list might grow and "overcrowd" the log. Writing the contents of a list to the log can balloon it and slow processing noticeably. It also wastes time in debugging. Log a summary instead, as in the sketch below.
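
If you need a signal, log the size or a bounded sample instead of the whole collection; a rough sketch:

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

items = list(range(10_000))  # imagine this list grew over time

# Log a summary, not the payload
logger.info("Processed %d items (first: %s)", len(items), items[0] if items else None)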

    k. Don’t Log What the Framework Logs for You

    There are great things to log. E.g. the name of the current thread, the time, etc. But those are already written into the log by default almost everywhere. Don’t duplicate these efforts.

l. Don't log method entry/exit

Log only important events in the system. Entering or exiting a method isn't an important event. E.g., if I have a method that enables feature X, the log should be "Feature X enabled" and not "enable_feature_X entered". I have done this a lot.

m. Don't fill the method

A complex method might include multiple points of failure, so it makes sense that we'd place logs at multiple points in the method to detect failure along the way. Unfortunately, this leads to duplicate logging and verbosity.

Errors will typically map to error-handling code, which should be logged generically, so all error conditions should already be covered there.

Sometimes this even means changing the flow/behavior of the code so that logging comes out more elegant.

    n. Don’t use AOP logging

    AOP (Aspect-Oriented Programming) logging allows you to automatically add logs at specific points in your application, such as when methods are entered or exited.

    In Python, AOP-style logging can be implemented using decorators or middleware that inject logs into specific points, such as method entry and exit. While it might seem appealing for detailed tracing, the same problems apply as in other languages like Java.

    
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def log_method_entry_exit(func):
    @functools.wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        logger.info(f"Entering: {func.__name__} with args={args} kwargs={kwargs}")
        result = func(*args, **kwargs)
        logger.info(f"Exiting: {func.__name__} with result={result}")
        return result
    return wrapper

# Example usage
@log_method_entry_exit
def example_function(x, y):
    return x + y

example_function(5, 3)
    
    

    Why Avoid AOP Logging in Python

    1. Performance Impact:
      • Injecting logs into every method increases runtime overhead, especially if used extensively in large-scale systems.
      • In Python, where function calls already add some overhead, this can significantly affect performance.
    2. Log Verbosity:
      • If this decorator is applied to every function or method in a system, it produces an enormous amount of log data.
      • Debugging becomes harder because the meaningful logs are lost in the noise of entry/exit logs.
    3. Limited Usefulness:
      • During local development, tools like Python debuggers (pdb), profilers (cProfile, line_profiler), or tracing libraries like trace are far more effective for inspecting function behavior and performance.
    4. CI Issues:
      • Enabling such verbose logging during CI test runs can make tracking test failures more difficult because the logs are flooded with entry/exit messages, obscuring the root cause of failures.

    Use Python-specific tools like pdb, ipdb, or IDE-integrated debuggers to inspect code locally.

o. Don't double log

It's pretty common to log an error right before throwing it. However, since most error handling is generic, there's likely already a log line in the generic error-handling code, and the same failure ends up logged twice.

5. Ensuring Scalability

    To keep your logging system robust and scalable,

    • Monitor Log Storage: Set alerts for log storage thresholds.
    • Implement Compression: Compress log files to reduce storage costs.
    • Automate Archival and Deletion: Regularly archive old logs and purge obsolete data.
    • Benchmark Logging Overhead: Measure the performance impact of logging on your application.

6. Logging for Metrics

Below is a list of items I wish to log for metrics.

    General API Metrics

    1. General API Metrics on HTTP methods, status codes, latency/duration, request size.
    2. Total requests per endpoint over time. Requests per minute/hour.
    3. Frequency and breakdown of 4XX and 5XX errors.
    4. User ID or API client making the request.
    
    {
      "timestamp": "2025-01-20T12:34:56Z",
      "endpoint": "/projects",
      "method": "POST",
      "status_code": 201,
      "user_id": 12345,
      "request_size_bytes": 512,
      "response_size_bytes": 256,
      "duration_ms": 120
    }
    

    Business Specific Metrics

    1. Objects (session) creations: No. of projects created (daily/weekly)
    2. Average success/failure rate.
    3. Average time to create a session.
    4. Frequency of each action on top of session.
    
    {
      "timestamp": "2025-01-20T12:35:00Z",
      "endpoint": "/projects/12345/actions",
      "action": "edit",
      "status_code": 200,
      "user_id": 12345,
      "duration_ms": 98
    }
    

    Performance Metrics

    1. Database query metrics on execution time, no. of queries per request.
    2. Third party service metrics on time spent, success/failure rates of external calls.
    
    {
      "timestamp": "2025-01-20T12:37:15Z",
      "endpoint": "/projects/12345",
      "db_query_time_ms": 45,
      "external_api_time_ms": 80,
      "status_code": 200,
      "duration_ms": 130
    }
    
    

    Scalability Metrics

    1. Concurrency metrics on max request handled.
    2. Request queue times during load.
    3. System Metrics on CPU and Memory usage during request processing (this will be auto captured).

    Usage Metrics

    1. Traffic analysis on peak usage times.
    2. Most/Least used endpoints.

7. Mapped Diagnostic Context (MDC)

MDC is the one I longed for most. I also got into trouble by implementing it without a middleware.

    Mapped Diagnostic Context (MDC) is a feature provided by many logging frameworks, such as Logback, Log4j, and SLF4J. It allows developers to attach contextual information (key-value pairs) to the logging events, which can then be automatically included in log messages.

    This context helps in differentiating and correlating log messages, especially in multi-threaded applications.

    Why Use MDC?

    1. Enhanced Log Clarity: By adding contextual information like user IDs, session IDs, or transaction IDs, MDC enables logs to provide more meaningful insights.
    2. Easier Debugging: When logs contain thread-specific context, tracing the execution path of a specific transaction or user request becomes straightforward.
    3. Reduced Log Ambiguity: MDC ensures that logs from different threads or components do not get mixed up, avoiding confusion.

    Common Use Cases

    1. Web Applications: Logging user sessions, request IDs, or IP addresses to trace the lifecycle of a request.
    2. Microservices: Propagating correlation IDs across services for distributed tracing.
    3. Background Tasks: Tracking specific jobs or tasks in asynchronous operations.

Limitations (curated from other blogs; I haven't tried these yet)

    1. Thread Boundaries: MDC is thread-local, so its context does not automatically propagate across threads (e.g., in asynchronous executions). For such scenarios, you may need to manually propagate the MDC context.
    2. Overhead: Adding and managing MDC context introduces a small runtime overhead, especially in high-throughput systems.
    3. Configuration Dependency: Proper MDC usage often depends on correctly configuring the logging framework.

    
    2025-01-21 14:22:15.123 INFO  [thread-1] [userId=12345, transactionId=abc123] Starting transaction
    2025-01-21 14:22:16.456 DEBUG [thread-1] [userId=12345, transactionId=abc123] Processing request
    2025-01-21 14:22:17.789 ERROR [thread-1] [userId=12345, transactionId=abc123] Error processing request: Invalid input
    2025-01-21 14:22:18.012 INFO  [thread-1] [userId=12345, transactionId=abc123] Transaction completed
    
    

In FastAPI, we can implement this via a middleware. The sketch below uses contextvars (the async-friendly analogue of thread-local MDC) plus a logging filter that copies the context onto each record,

    
import logging
import uuid
from contextvars import ContextVar

from fastapi import FastAPI, Request
from starlette.middleware.base import BaseHTTPMiddleware

# Context variables hold the MDC values for the current request
# (unlike attributes set on a logger, these actually reach the formatter)
user_id_var: ContextVar[str] = ContextVar("user_id", default="unknown")
transaction_id_var: ContextVar[str] = ContextVar("transaction_id", default="unknown")

# A filter that copies the MDC values onto every log record
class MDCFilter(logging.Filter):
    def filter(self, record):
        record.user_id = user_id_var.get()
        record.transaction_id = transaction_id_var.get()
        return True

# Configure the logger with MDC placeholders in the format string
logger = logging.getLogger("app")
logger.setLevel(logging.INFO)

console_handler = logging.StreamHandler()
console_handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s [%(threadName)s] "
    "[userId=%(user_id)s, transactionId=%(transaction_id)s] %(message)s"
))
console_handler.addFilter(MDCFilter())
logger.addHandler(console_handler)

# FastAPI application
app = FastAPI()

# Custom middleware that populates the MDC for the lifetime of each request
class RequestContextMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        user_id_var.set(request.headers.get("X-User-ID", "default-user"))
        transaction_id_var.set(str(uuid.uuid4()))

        logger.info("Request started")
        response = await call_next(request)
        logger.info("Request finished")
        return response

# Add the custom middleware to the FastAPI app
app.add_middleware(RequestContextMiddleware)

@app.get("/")
async def read_root():
    logger.info("Handling the root endpoint.")
    return {"message": "Hello, World!"}

@app.get("/items/{item_id}")
async def read_item(item_id: int):
    logger.info(f"Fetching item with ID {item_id}")
    return {"item_id": item_id}
    
    

Hope you now have a better idea of logging.

    Locust EP 1 : Load Testing: Ensuring Application Reliability with Real-Time Examples and Metrics

In today's fast-paced digital world, delivering a reliable and scalable application is key to providing a positive user experience.

    One of the most effective ways to guarantee this is through load testing. This post will walk you through the fundamentals of load testing, real-time examples of its application, and crucial metrics to watch for.

    What is Load Testing?

    Load testing is a type of performance testing that simulates real-world usage of an application. By applying load to a system, testers observe how it behaves under peak and normal conditions. The primary goal is to identify any performance bottlenecks, ensure the system can handle expected user traffic, and maintain optimal performance.

    Load testing answers these critical questions:

    • Can the application handle the expected user load?
    • How does performance degrade as the load increases?
    • What is the system’s breaking point?

    Why is Load Testing Important?

    Without load testing, applications are vulnerable to crashes, slow response times, and unavailability, all of which can lead to a poor user experience, lost revenue, and brand damage. Proactive load testing allows teams to address issues before they impact end-users.

    Real-Time Load Testing Examples

    Let’s explore some real-world examples that demonstrate the importance of load testing.

    Example 1: E-commerce Website During a Sale Event

    An online retailer preparing for a Black Friday sale knows that traffic will spike. They conduct load testing to simulate thousands of users browsing, adding items to their cart, and checking out simultaneously. By analyzing the system’s response under these conditions, the retailer can identify weak points in the checkout process or database and make necessary optimizations.

    Example 2: Video Streaming Platform Launch

    A new streaming platform is preparing for launch, expecting millions of users. Through load testing, the team simulates high traffic, testing how well video streaming performs under maximum user load. This testing also helps check if CDN (Content Delivery Network) configurations are optimized for global access, ensuring minimal buffering and downtime during peak hours.

    Example 3: Financial Services Platform During Market Hours

    A trading platform experiences intense usage during market open and close hours. Load testing helps simulate these peak times, ensuring that real-time data updates, transactions, and account management work flawlessly. Testing for these scenarios helps avoid issues like slow trade executions and platform unavailability during critical trading periods.

    Key Metrics to Monitor in Load Testing

    Understanding key metrics is essential for interpreting load test results. Here are some critical metrics to focus on:

    1. Response Time

    • Definition: The time taken by the system to respond to a request.
    • Why It Matters: Slow response times can frustrate users and indicate bottlenecks.
    • Example Thresholds: For websites, a response time below 2 seconds is considered acceptable.

    2. Throughput

    • Definition: The number of requests processed per second.
    • Why It Matters: Throughput indicates how many concurrent users your application can handle.
    • Real-Time Use Case: In our e-commerce example, the retailer would track throughput to ensure the checkout process doesn’t become a bottleneck.

    3. Error Rate

    • Definition: The percentage of failed requests out of total requests.
    • Why It Matters: A high error rate could indicate application instability under load.
    • Real-Time Use Case: The trading platform monitors the error rate during market close, ensuring the system doesn’t throw errors under peak trading load.

    4. CPU and Memory Utilization

    • Definition: The percentage of CPU and memory resources used during the load test.
    • Why It Matters: High CPU or memory utilization can signal that the server may not handle additional load.
    • Real-Time Use Case: The video streaming platform tracks memory usage to prevent lag or interruptions in streaming as users increase.

    5. Concurrent Users

    • Definition: The number of users active on the application at the same time.
    • Why It Matters: Concurrent users help you understand how much load the system can handle before performance starts degrading.
    • Real-Time Use Case: The retailer tests how many concurrent users can shop simultaneously without crashing the website.

    6. Latency

    • Definition: The time it takes for a request to travel from the client to the server and back.
    • Why It Matters: High latency indicates network or processing delays that can slow down the user experience.
    • Real-Time Use Case: For a financial app, reducing latency ensures trades execute in near real-time, which is crucial for users during volatile market conditions.

    7. 95th and 99th Percentile Response Times

    • Definition: The time within which 95% or 99% of requests are completed.
    • Why It Matters: These percentiles help identify outliers that may impact user experience.
    • Real-Time Use Case: The streaming service may analyze these percentiles to ensure smooth playback for most users, even under peak loads.

    Best Practices for Effective Load Testing

    1. Set Clear Objectives: Define specific goals, such as the expected number of concurrent users or acceptable response times, based on the nature of the application.
    2. Use Realistic Load Scenarios: Create scenarios that mimic actual user behavior, including peak times, user interactions, and geographical diversity.
    3. Analyze Bottlenecks and Optimize: Use test results to identify and address performance bottlenecks, whether in the application code, database queries, or server configurations.
    4. Monitor in Real-Time: Track metrics like response time, throughput, and error rates in real-time to identify issues as they arise during the test.
    5. Repeat and Compare: Conduct multiple load tests to ensure consistent performance over time, especially after any significant update or release.
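
To make this concrete, here is a minimal sketch of what such a scenario can look like in Locust (the endpoints, weights, and wait times are illustrative):

from locust import HttpUser, task, between

class ShopUser(HttpUser):
    wait_time = between(1, 5)  # simulated think time between user actions

    @task(3)  # browsing is weighted to happen more often than checkout
    def browse_products(self):
        self.client.get("/products")

    @task(1)
    def checkout(self):
        self.client.post("/checkout", json={"cart_id": "demo-cart"})

Run it with locust -f locustfile.py --host https://your-app.example.com and watch response times, throughput, and error rates evolve in the web UI as you increase the user count.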

    Load testing is crucial for building a resilient and scalable application. By using real-world scenarios and keeping a close eye on metrics like response time, throughput, and error rates, you can ensure your system performs well under load. Proactive load testing helps to deliver a smooth, reliable experience for users, even during peak times.
