Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

Learning Notes #65 – Application Logs, Metrics, MDC

21 January 2025 at 05:45

I am big fan of logs. Would like to log everything. All the request, response of an API. But is it correct ? Though logs helped our team greatly during this new year, i want to know, is there a better approach to log things. That search made this blog. In this blog i jot down notes on logging. Lets log it.

Throughout this blog, i try to generalize things. Not biased to a particular language. But here and there you can see me biased towards Python. Also this is my opinion. Not a hard rule.

Which is a best logger ?

I’m not here to argue about which logger is the best, they all have their problems. But the worst one is usually the one you build yourself. Sure, existing loggers aren’t perfect, but trying to create your own is often a much bigger mistake.

1. Why Logging Matters

Logging provides visibility into your application’s behavior, helping to,

  • Diagnose and troubleshoot issues (This is most common usecase)
  • Monitor application health and performance (Metrics)
  • Meet compliance and auditing requirements (Audit Logs)
  • Enable debugging in production environments (we all do this.)

However, poorly designed logging strategies can lead to excessive log volumes, higher costs, and difficulty in pinpointing actionable insights.

2. Logging Best Practices

a. Use Structured Logs

Long story short, instead of unstructured plain text, use JSON or other structured formats. This makes parsing and querying easier, especially in log aggregation tools.


{
  "timestamp": "2025-01-20T12:34:56Z",
  "level": "INFO",
  "message": "User login successful",
  "userId": 12345,
  "sessionId": "abcde12345"
}

b. Leverage Logging Levels

Define and adhere to appropriate logging levels to avoid log bloat:

  • DEBUG: Detailed information for debugging.
  • INFO: General operational messages.
  • WARNING: Indications of potential issues.
  • ERROR: Application errors that require immediate attention.
  • CRITICAL: Severe errors leading to application failure.

c. Avoid Sensitive Data

Sanitize your logs to exclude sensitive information like passwords, PII, or API keys. Instead, mask or hash such data. Don’t add token even for testing.


d. Include Contextual Information

Incorporate metadata like request IDs, user IDs, or transaction IDs to trace specific events effectively.


3. Log Ingestion at Scale

As applications scale, log ingestion can become a bottleneck. Here’s how to manage it,

a. Centralized Logging

Stream logs to centralized systems like Elasticsearch, Logstash, Kibana (ELK), or cloud-native services like AWS CloudWatch, Azure Monitor, or Google Cloud Logging.

b. Optimize Log Volume

  • Log only necessary information.
  • Use log sampling to reduce verbosity in high-throughput systems.
  • Rotate logs to limit disk usage.

c. Use Asynchronous Logging

Asynchronous loggers improve application performance by delegating logging tasks to separate threads or processes. (Not Suitable all time. It has its own problems)

d. Method return values are usually important

If you have a log in the method and don’t include the return value of the method, you’re missing important information. Make an effort to include that at the expense of slightly less elegant looking code.

e. Include filename in error messages

Mention the path/to/file:line-number to pinpoint the location of the issue.

3. Logging Don’ts

a. Don’t Log Everything at the Same Level

Logging all messages at the INFO or DEBUG level creates noise and makes it difficult to identify critical issues.

b. Don’t Hardcode Log Messages

Avoid static, vague, or generic log messages. Use dynamic and descriptive messages that include relevant context.

# Bad Example
Error occurred.

# Good Example
Error occurred while processing payment for user_id=12345, transaction_id=abc-6789.

c. Don’t Log Sensitive or Regulated Data

Exposing personally identifiable information (PII), passwords, or other sensitive data in logs can lead to compliance violations (e.g., GDPR, HIPAA).

d. Don’t Ignore Log Rotation

Failing to implement log rotation can result in disk space exhaustion, especially in high traffic systems (Log Retention).

e. Don’t Overlook Log Correlation

Logs without request IDs, session IDs, or contextual metadata make it difficult to correlate related events.

f. Don’t Forget to Monitor Log Costs

Logging everything without considering storage and processing costs can lead to financial inefficiency in large-scale systems.

g. Keep the log message short

Long and verbose messages are a cost. The cost is in reading time and ingestion time.

h. Never use log message in loop

This might seem obvious, but just to be clear -> logging inside a loop, even if the log level isn’t visible by default, can still hurt performance. It’s best to avoid this whenever possible.

If you absolutely need to log something at a hidden level and decide to break this guideline, keep it short and straightforward.

i. Log item you already “have”

We should avoid this,


logger.info("Reached X and value of method is {}", method());

Here, just for the logging purpose, we are calling the method() again. Even if the method is cheap. You’re effectively running the method regardless of the respective logging levels!

j. Dont log iterables

Even if it’s a small list. The concern is that the list might grow and “overcrowd” the log. Writing the content of the list to the log can balloon it up and slow processing noticeably. Also kills time in debugging.

k. Don’t Log What the Framework Logs for You

There are great things to log. E.g. the name of the current thread, the time, etc. But those are already written into the log by default almost everywhere. Don’t duplicate these efforts.

l.Don’t log Method Entry/Exit

Log only important events in the system. Entering or exiting a method isn’t an important event. E.g. if I have a method that enables feature X the log should be “Feature X enabled” and not “enable_feature_X entered”. I have done this a lot.

m. Dont fill the method

A complex method might include multiple points of failure, so it makes sense that we’d place logs in multiple points in the method so we can detect the failure along the way. Unfortunately, this leads to duplicate logging and verbosity.

Errors will typically map to error handling code which should be logged in generically. So all error conditions should already be covered.

This creates situations where we sometimes need to change the flow/behavior of the code, so logging will be more elegant.

n. Don’t use AOP logging

AOP (Aspect-Oriented Programming) logging allows you to automatically add logs at specific points in your application, such as when methods are entered or exited.

In Python, AOP-style logging can be implemented using decorators or middleware that inject logs into specific points, such as method entry and exit. While it might seem appealing for detailed tracing, the same problems apply as in other languages like Java.


import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def log_method_entry_exit(func):
    def wrapper(*args, **kwargs):
        logger.info(f"Entering: {func.__name__} with args={args} kwargs={kwargs}")
        result = func(*args, **kwargs)
        logger.info(f"Exiting: {func.__name__} with result={result}")
        return result
    return wrapper

# Example usage
@log_method_entry_exit
def example_function(x, y):
    return x + y

example_function(5, 3)

Why Avoid AOP Logging in Python

  1. Performance Impact:
    • Injecting logs into every method increases runtime overhead, especially if used extensively in large-scale systems.
    • In Python, where function calls already add some overhead, this can significantly affect performance.
  2. Log Verbosity:
    • If this decorator is applied to every function or method in a system, it produces an enormous amount of log data.
    • Debugging becomes harder because the meaningful logs are lost in the noise of entry/exit logs.
  3. Limited Usefulness:
    • During local development, tools like Python debuggers (pdb), profilers (cProfile, line_profiler), or tracing libraries like trace are far more effective for inspecting function behavior and performance.
  4. CI Issues:
    • Enabling such verbose logging during CI test runs can make tracking test failures more difficult because the logs are flooded with entry/exit messages, obscuring the root cause of failures.

Use Python-specific tools like pdb, ipdb, or IDE-integrated debuggers to inspect code locally.

o. Dont Double log

It’s pretty common to log an error when we’re about to throw an error. However, since most error code is generic, it’s likely there’s a log in the generic error handling code.

4. Ensuring Scalability

To keep your logging system robust and scalable,

  • Monitor Log Storage: Set alerts for log storage thresholds.
  • Implement Compression: Compress log files to reduce storage costs.
  • Automate Archival and Deletion: Regularly archive old logs and purge obsolete data.
  • Benchmark Logging Overhead: Measure the performance impact of logging on your application.

5. Logging for Metrics

Below, is the list of items that i wish can be logged for metrics.

General API Metrics

  1. General API Metrics on HTTP methods, status codes, latency/duration, request size.
  2. Total requests per endpoint over time. Requests per minute/hour.
  3. Frequency and breakdown of 4XX and 5XX errors.
  4. User ID or API client making the request.

{
  "timestamp": "2025-01-20T12:34:56Z",
  "endpoint": "/projects",
  "method": "POST",
  "status_code": 201,
  "user_id": 12345,
  "request_size_bytes": 512,
  "response_size_bytes": 256,
  "duration_ms": 120
}

Business Specific Metrics

  1. Objects (session) creations: No. of projects created (daily/weekly)
  2. Average success/failure rate.
  3. Average time to create a session.
  4. Frequency of each action on top of session.

{
  "timestamp": "2025-01-20T12:35:00Z",
  "endpoint": "/projects/12345/actions",
  "action": "edit",
  "status_code": 200,
  "user_id": 12345,
  "duration_ms": 98
}

Performance Metrics

  1. Database query metrics on execution time, no. of queries per request.
  2. Third party service metrics on time spent, success/failure rates of external calls.

{
  "timestamp": "2025-01-20T12:37:15Z",
  "endpoint": "/projects/12345",
  "db_query_time_ms": 45,
  "external_api_time_ms": 80,
  "status_code": 200,
  "duration_ms": 130
}

Scalability Metrics

  1. Concurrency metrics on max request handled.
  2. Request queue times during load.
  3. System Metrics on CPU and Memory usage during request processing (this will be auto captured).

Usage Metrics

  1. Traffic analysis on peak usage times.
  2. Most/Least used endpoints.

6. Mapped Diagnostic Context (MDC)

MDC is the one, i longed for most. Also went into trouble by implementing without a middleware.

Mapped Diagnostic Context (MDC) is a feature provided by many logging frameworks, such as Logback, Log4j, and SLF4J. It allows developers to attach contextual information (key-value pairs) to the logging events, which can then be automatically included in log messages.

This context helps in differentiating and correlating log messages, especially in multi-threaded applications.

Why Use MDC?

  1. Enhanced Log Clarity: By adding contextual information like user IDs, session IDs, or transaction IDs, MDC enables logs to provide more meaningful insights.
  2. Easier Debugging: When logs contain thread-specific context, tracing the execution path of a specific transaction or user request becomes straightforward.
  3. Reduced Log Ambiguity: MDC ensures that logs from different threads or components do not get mixed up, avoiding confusion.

Common Use Cases

  1. Web Applications: Logging user sessions, request IDs, or IP addresses to trace the lifecycle of a request.
  2. Microservices: Propagating correlation IDs across services for distributed tracing.
  3. Background Tasks: Tracking specific jobs or tasks in asynchronous operations.

Limitations (Curated from other blogs. I havent tried yet )

  1. Thread Boundaries: MDC is thread-local, so its context does not automatically propagate across threads (e.g., in asynchronous executions). For such scenarios, you may need to manually propagate the MDC context.
  2. Overhead: Adding and managing MDC context introduces a small runtime overhead, especially in high-throughput systems.
  3. Configuration Dependency: Proper MDC usage often depends on correctly configuring the logging framework.


2025-01-21 14:22:15.123 INFO  [thread-1] [userId=12345, transactionId=abc123] Starting transaction
2025-01-21 14:22:16.456 DEBUG [thread-1] [userId=12345, transactionId=abc123] Processing request
2025-01-21 14:22:17.789 ERROR [thread-1] [userId=12345, transactionId=abc123] Error processing request: Invalid input
2025-01-21 14:22:18.012 INFO  [thread-1] [userId=12345, transactionId=abc123] Transaction completed

In Fastapi, we can implement this via a middleware,


import logging
import uuid
from fastapi import FastAPI, Request
from starlette.middleware.base import BaseHTTPMiddleware

# Configure the logger
logger = logging.getLogger("uvicorn")
logger.setLevel(logging.INFO)

# Create a custom formatter with MDC placeholders
class CustomFormatter(logging.Formatter):
    def format(self, record):
        record.user_id = getattr(record, "user_id", "unknown")
        record.transaction_id = getattr(record, "transaction_id", str(uuid.uuid4()))
        return super().format(record)

# Set the logging format with MDC keys
formatter = CustomFormatter(
    "%(asctime)s %(levelname)s [%(threadName)s] [userId=%(user_id)s, transactionId=%(transaction_id)s] %(message)s"
)

# Apply the formatter to the handler
console_handler = logging.StreamHandler()
console_handler.setFormatter(formatter)
logger.addHandler(console_handler)

# FastAPI application
app = FastAPI()

# Custom Middleware to add MDC context
class RequestContextMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        # Add MDC info before handling the request
        user_id = request.headers.get("X-User-ID", "default-user")
        transaction_id = str(uuid.uuid4())
        logging.getLogger().info(f"Request started: {user_id}, {transaction_id}")

        # Add MDC info to log
        logging.getLogger().user_id = user_id
        logging.getLogger().transaction_id = transaction_id

        response = await call_next(request)

        # Optionally, log additional information when the response is done
        logging.getLogger().info(f"Request finished: {user_id}, {transaction_id}")

        return response

# Add custom middleware to the FastAPI app
app.add_middleware(RequestContextMiddleware)

@app.get("/")
async def read_root():
    logger.info("Handling the root endpoint.")
    return {"message": "Hello, World!"}

@app.get("/items/{item_id}")
async def read_item(item_id: int):
    logger.info(f"Fetching item with ID {item_id}")
    return {"item_id": item_id}

Hope, you might have got a better idea on logging.

Docker Ep 8: Tomcat: Exploring Docker Port Mapping and Logs

15 August 2024 at 09:32

Caution: We are just starting from basics. Even if you don’t understand below concepts there is no problem.

In our previous adventures, we’ve dabbled in Docker’s mysterious arts, running containers, inspecting them, and even detaching them to roam in the background.

Today, we’re stepping up our game by diving into Docker port mapping and the powerful docker logs command. And what better companion for this journey than the Tomcat image, a trusty open-source web server that brings Java servlets to life?

Summoning the Tomcat Image

To begin, we need to summon our new ally: the Tomcat image. Tomcat is a legendary web server that, by default, operates on port 8080 within its container. But what if we want to make this web server accessible to the outside world through a different port? This is where Docker’s port mapping comes into play.

First, let’s visit the Docker Hub and search for the Tomcat image. Once there, we can see that Tomcat will run on port 8080 by default. We’ll need to expose this port and map it to a port on our host machine using the -p option.

Port Mapping: Connecting the Container to the World

Docker port mapping allows us to make the services inside our container accessible from the outside world by forwarding a host port to a container port. The syntax for port mapping is:

-p <host_port>:<container_port>

For example, let’s say we want to map port 8080 inside the container to port 8888 on our host machine. This means that when we access port 8888 on our host, we’ll actually be talking to port 8080 inside the Tomcat container.

Let’s see this in action.

Running Tomcat with Port Mapping

Fire up your Docker terminal, make sure the font size is nice and large, and enter the following command:

docker run -it -p 8888:8080 tomcat:8.0

Here’s what’s happening:

  • -it runs the container interactively.
  • -p 8888:8080 maps port 8080 inside the container to port 8888 on our host.
  • tomcat:8.0 specifies the Tomcat image and its version.

Now, sit back and relax (or maybe grab a coffee) as Docker pulls down the Tomcat image. It’s about 300 MB, so depending on your internet connection, this might take a moment.

Once the image is downloaded, Docker will spin up the container, and our Tomcat server will be ready to roll.

Accessing the Tomcat Server

With the Tomcat container up and running, we can access the Tomcat web server through our web browser. But first, we need to know where to find it.

If you’re running Docker on Linux, or using Docker for Mac or Windows, the host IP is simply localhost. If you’re using Docker Machine, you’ll need to grab the IP address of the Docker Machine.

Now, open up your browser, type in localhost:8888 (or your Docker Machine’s IP address followed by :8888), and hit Enter.

The Tomcat console page should greet you warmly.

Running Tomcat in Detached Mode

While running a container in the foreground is great for testing, in production, we typically want our containers to run in the background. For this, we use the -d flag to run the container in detached mode.

Let’s modify our previous command to run Tomcat in the background:

docker run -d -p 8888:8080 tomcat:8.0

After hitting Enter, Docker will return a long container ID, confirming that Tomcat is now running in the background, silently serving up Java servlets.

Checking Container Status with docker ps -a

Curious about the status of your Tomcat container? Use the docker ps -a command to list all the containers on your system, including those that have exited. This is a handy way to check if your container is still running or if it has quietly exited.

docker ps -a

Reading the Tea Leaves: docker logs

Sometimes, you want to peek into the inner workings of your running containers to see what they’re up to. This is where the docker logs command comes in. It lets you view the logs generated by your container, which can be incredibly useful for debugging or just keeping an eye on things.

Let’s say you want to check the logs of your running Tomcat container. You would run:

docker logs <container_id>

There is more on docker logs, we shall see later.

Kate – unsung excellent editor by KDE

19 May 2024 at 03:22

Nowadays, eventhough I am becoming an avid fan of Emacs editor, sometime I use other editors.

Kate is one of the nice editor, I am using for many years, as KDE is my desktop.

Kate is a simple GUI based text editor, which has very good features. explore its features here – https://kate-editor.org/about-kate/

Few days back, I was explore few huge airflow log files, to find some specific errors. KDE opened the log file automatically in Kate editor.

For my surprise, it opened the huge files with highlighting the error lines.

4C4mIX3.jpeg

As I am using CLI editors on local and remote machines most of the times, I have to go through all the lines always. Eyes are trained to skim quickly and find the errors patterns easily. But at this time, the Kate editor, gave color different for all the error messages. The right side small preview window helped to goto next error sections easily.

These may be the default features of other IDEs or modern editors. But, these are new tiny happy inventions for me.

Thanks KDE team and Kate developers for making our life easier and bringing happiness to all.

What are the other open source editors have these features? What is the editor you are using? Share your thoughts in your blog and reply here with the URL.

❌
❌