Wednesday, October 6, 2021

How to distinguish logging per Client or Request in Java? Use MDC or Mapped Diagnostic Context in Log4j Example

The MDC or Mapped Diagnostic Context is a concept or feature of the Log4j logging library which can be used to group related log messages together. For example, by using MDC you can stamp a unique identification String like clientId or orderId on each log message, and then by using grep command in Linux, you can extract all log messages for a particular client or order to understand exactly what happened to a particular order. This is especially very useful in multi-threaded, concurrent Java applications where multiple threads are simultaneously processing multiple orders from multiple clients.  In such applications, searching for relevant log messages in a big log file where log messages for multiple orders or clients are overlapping is a big task.

The MDC concept is not a new concept and available since log4j 1.2 version. Since most real-world systems like high-frequency trading application process thousands of orders simultaneously where different thread runs the same piece of code but with different data.

Each execution is unique, but without unique identification, it's difficult to find all log messages related to processing a particular order. The MDC or Mapped Diagnostic Context feature of Log4j solves that problem by stamping unique identification strings on relevant log messages.

Once you put this information on MDC at the start of your processing cycle, all log messages result with the processing of that order will be stamped with the same OrderId, but processing of different orders will be having different orderId, this way you can track complete processing of a particular order. You can even find if an order is made to your system or not.




Why use Mapped Diagnostic Context or MDC in Log4j?

If you have worked in a multithreaded high volume transaction processing system, then you have faced a situation, where you need to trace processing of a particular order, without MDC, it's nightmare. You need to include Order specific information in every single log message you print.

With MDC or ThreadContext from Log4j2 onwards, you just need to add that data one time, at the start of your processing, after that it's available in every log message.

Btw, this issue is not just for multi-threaded core Java applications, you will face similar issues on enterprise or web applications, where multiple client requests are processed simultaneously by different threads. Log messages from processing two different orders may interleave together. 

MDC allows you to stamp each log message with a unique identification number e.g. orderId, which is unique for every order, sessionId, which is unique for every session, or username, which is unique for every user.

In short, Mapped Diagnostic Context will help you in debugging, troubleshooting, and even auditing. It's also one of the logging best practices to employ on production systems.

Log4j also provides a similar utility called NDC, known as Nested Diagnostic Context, both of which are replaced by ThreadContext class in Log4j 2.

The ThreadContext class provides a Map and a Set to replace MDC and NDC. The Thread Context Map allows any number of items to be added and be identified using key/value pairs, while ThreadContextStack allows one or more items to be pushed on the Stack and then be identified by their order in the Stack or by the data itself. 

Since key/value pairs are more flexible, the Thread Context Map is recommended when data items may be added during the processing of the requestor when there are more than one or two items. 

Btw, if you are an experienced Java developer, I suggest you joining these Java performance courses or reading a good book e.g. Java Performance Definitive Guide By Scott Oaks to learn more about how logging impacts performance. It's very important for a seasoned developer to know what he is doing. 





How to use MDC to log client-specific information in log4j

You can put any information into Mapped Diagnostic Context or MDC by calling the put() method. MDC is a static class i.e. a class with just static methods, which means you can directly call them without creating any instance of MDC.

Remember, this information is stored as a thread-local variable, which may cause a memory leak in a web application if used incorrectly (see). If you are using MDC to log client or order specific information e.g. orderId in a web application using a filter then you can also remove it once done.

try{
  MDC.put("tradeId", trade.getId());
}finally{
  MDC.remove.remove("tradeId");
}


By the way from log4j2 onwards, MDC and NDC (Nested Diagnostic Context) are merged into a class called ThreadContext So, instead of using MDC, you need to use ThreadContext class to put unique context data.

try{
   ThreadContext.put("tradeId", trade.getId());
}finally {
   ThreadContext.clear();
}




Once available in MDC, this information can be printed in the log file using PatternLayout. You can use %X{tradeId) to print tradeId in log messages, remember tradeId for different trades will be different, which allows you to trace logging messages for individual trades. You can use the following Patterns to include MDC contents in log files:
  • Use %X by itself to include the full contents of the Map.
  • Use %X{key} to include the specified key.
  • Use %x to include the full contents of the Stack

Remember MDC is managed on a per-thread basis and every child thread automatically inherits a copy of the mapped diagnostic context from its parent. This is achieved by using InheritableThreadLocal class, which is a subclass of the ThreadLocal class.

This class extends ThreadLocal to provide inheritance of values from parent thread to child thread: when a child thread is created, the child receives initial values for all inheritable thread-local variables for which the parent has values.

Here is an example, where both parent and child thread is using the same value from MDC or ThreadContextMap:




Normally the child's values will be identical to the parent's; however, the child's value can be made an arbitrary function of the parent's by overriding the childValue() method


That's all guys if you have been using Log4j 1.x or Log4j2 and not using MDC, you are missing a very important feature, especially if you often need to know what happened to a particular order, request, or user. 

I have found putting information like OrderId, Username, RquestId, or something which can distinguish each request adds tremendous value during troubleshooting and debugging. It's also easier to trace orders and answer support queries. 

If you have a unique id to distinguish each message, then always include them into MDC to distinguish logging information from client or order basis.



Other Java Logging Tutorials and Best Practices you may like
  • 5 Java Performance tuning books for experienced Programmers (list)
  • Why use Log4j logging vs System.out.println in Java? (answer)
  • 40 OOP Interview questions with Answers (OOP questions)
  • Why use SLF4j over log4j for logging in Java? (answer)
  • 15 Spring Data JPA Interview Questions (Spring data jpa questions)
  • How to enable SSL debugging log in Java Virtual Machine? (tips)
  • 20 Spring Boot Testing Interview Questions (testing questions)
  • How to configure Log4j without XML or Properties file in Java? (tutorial)
  • 30 React Interview Questions with Answers (react questions)
  • Difference between Functional and Non-Functional Requirement (article)
  • 30 Python Interview Questions with Answers (python questions)
  • 30 JavaScript Interview Questions with Answers (javascript questions)
  • 10 Essential JVM Options for Real-world Java Applications (tips)

Thanks for reading this article so far. If you have liked this article then you may want to like our Facebook page (Javarevisited) as well to receive updates and information about Java and related technology. If you have any suggestions or feedback then please drop a comment.

No comments :

Post a Comment