Medium

383. Print Processes and their messages from a given log file

Problem Statement: 

An application runs several processes in parallel, and every process writes to a single shared log file. Each log line contains a timestamp, the process name, the message type (WARN, ERROR, etc.), and the actual message. You are given a start timestamp and an end timestamp. Write an algorithm that processes the log file and, for each process, counts the messages of each type that fall within the start and end timestamps. Also find the top 2 processes with the most messages in that range.

 Example:

Sample Logfile:

Timestamp|Process|MessageType|Message
1540403503|S1|WARN|warning message
1540403503|S1|WARN|warning message
1540403504|S2|WARN|warning message
1540403504|S1|ERROR|error message
1540403604|S4|ERROR|error message
1540403614|S1|DEBUG|debug message
1540403614|S5|DEBUG|debug message
1540403615|S6|INFO|info message
1540403615|S6|DEBUG|debug message
1540403615|S6|DEBUG|debug message
1540403715|S7|INFO|info message
1540403715|S7|INFO|info message

Output (for a start and end timestamp that cover every line of the sample):
S1 - 1 ERROR / 1 DEBUG / 2 WARN /
S2 - 1 WARN /
S4 - 1 ERROR /
S5 - 1 DEBUG /
S6 - 1 INFO / 2 DEBUG /
S7 - 2 INFO /
Top 2 processes are S1 : 4 and S6 : 3
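
Each line of the sample splits into four pipe-delimited fields. As a rough sketch of that parsing step, the class below turns one line into its fields and checks it against a time range. The name `LogEntry`, its fields, and the choice to treat the range as inclusive at both ends are assumptions made for illustration; the problem statement does not fix them.

```java
import java.util.Optional;

/** One parsed log line. The class and field names are illustrative, not from the original post. */
final class LogEntry {
    final long timestamp;
    final String process;
    final String type;
    final String message;

    private LogEntry(long timestamp, String process, String type, String message) {
        this.timestamp = timestamp;
        this.process = process;
        this.type = type;
        this.message = message;
    }

    /** Parses "Timestamp|Process|MessageType|Message"; empty for the header row or a malformed line. */
    static Optional<LogEntry> parse(String line) {
        // A limit of 4 keeps any '|' inside the free-text message in the last field.
        String[] parts = line.split("\\|", 4);
        if (parts.length < 4) {
            return Optional.empty();
        }
        try {
            long timestamp = Long.parseLong(parts[0].trim());
            return Optional.of(new LogEntry(timestamp, parts[1], parts[2], parts[3]));
        } catch (NumberFormatException e) {
            return Optional.empty(); // non-numeric timestamp, e.g. the header row
        }
    }

    /** Assumption: the range is inclusive at both ends. */
    boolean inRange(long start, long end) {
        return timestamp >= start && timestamp <= end;
    }
}
```

Returning an empty `Optional` for the header row means the caller can feed the file in as-is without special-casing its first line.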

Approach:

  1. Use a LinkedHashMap with the process name as key and a HashMap as value; the inner HashMap has the message type as key and its count as value. The LinkedHashMap keeps processes in the order they first appear in the log.
  2. Process the given log file one line at a time.
  3. Read the timestamp; if it is within the given range, proceed, otherwise skip the line.
  4. Check whether the process name exists in the LinkedHashMap.
    • If not, create a HashMap with the message type as key and 1 as value, then insert this HashMap into the LinkedHashMap with the process name as key.
    • If so, get the HashMap from the LinkedHashMap using the process name as key. If the line's message type is already in the retrieved HashMap, increment its count; otherwise insert the message type with count 1.
  5. At the end, iterate over the LinkedHashMap to print the result.
  6. During this iteration, keep track of the two processes with the most log messages; this is the same technique as finding the two largest elements in an array.
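
The steps above could be sketched in Java roughly as follows. This is one possible implementation, not necessarily the original author's: the class name `LogSummary` and the method names `summarize` and `report` are made up for this example, the range is assumed to be inclusive, and `computeIfAbsent` plus `merge` stand in for the explicit "exists / does not exist" branches of step 4.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LogSummary {

    /** Steps 1-4: process name -> (message type -> count), processes in first-seen order. */
    static Map<String, Map<String, Integer>> summarize(List<String> lines, long start, long end) {
        Map<String, Map<String, Integer>> byProcess = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split("\\|", 4);
            if (parts.length < 4) {
                continue; // malformed line
            }
            long timestamp;
            try {
                timestamp = Long.parseLong(parts[0].trim());
            } catch (NumberFormatException e) {
                continue; // header row or non-numeric timestamp
            }
            if (timestamp < start || timestamp > end) {
                continue; // outside the (assumed inclusive) range
            }
            // Create the inner map on first sight of the process, then bump the type's count.
            byProcess.computeIfAbsent(parts[1], k -> new HashMap<>())
                     .merge(parts[2], 1, Integer::sum);
        }
        return byProcess;
    }

    /** Steps 5-6: build the report and track the two largest totals in the same pass. */
    static String report(Map<String, Map<String, Integer>> byProcess) {
        StringBuilder out = new StringBuilder();
        String first = null;
        String second = null;
        int firstCount = 0;
        int secondCount = 0;
        for (Map.Entry<String, Map<String, Integer>> entry : byProcess.entrySet()) {
            int total = 0;
            out.append(entry.getKey()).append(" -");
            for (Map.Entry<String, Integer> typeCount : entry.getValue().entrySet()) {
                out.append(' ').append(typeCount.getValue())
                   .append(' ').append(typeCount.getKey()).append(" /");
                total += typeCount.getValue();
            }
            out.append('\n');
            if (total > firstCount) {          // new maximum: old maximum becomes runner-up
                second = first;
                secondCount = firstCount;
                first = entry.getKey();
                firstCount = total;
            } else if (total > secondCount) {  // new runner-up; ties keep the earlier process
                second = entry.getKey();
                secondCount = total;
            }
        }
        if (second != null) {
            out.append("Top 2 processes are ").append(first).append(" : ").append(firstCount)
               .append(" and ").append(second).append(" : ").append(secondCount).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Timestamp|Process|MessageType|Message",
            "1540403503|S1|WARN|warning message", "1540403503|S1|WARN|warning message",
            "1540403504|S2|WARN|warning message", "1540403504|S1|ERROR|error message",
            "1540403604|S4|ERROR|error message", "1540403614|S1|DEBUG|debug message",
            "1540403614|S5|DEBUG|debug message", "1540403615|S6|INFO|info message",
            "1540403615|S6|DEBUG|debug message", "1540403615|S6|DEBUG|debug message",
            "1540403715|S7|INFO|info message", "1540403715|S7|INFO|info message");
        System.out.print(report(summarize(lines, 1540403503L, 1540403715L)));
    }
}
```

Processes are printed in first-seen order because the outer map is a LinkedHashMap. The order of message types within a line follows HashMap iteration order, which Java does not guarantee; swap the inner map for a LinkedHashMap or TreeMap if a stable order matters.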

Output:

S1 - 1 ERROR / 1 DEBUG / 2 WARN /
S2 - 1 WARN /
S4 - 1 ERROR /
S5 - 1 DEBUG /
S6 - 1 INFO / 2 DEBUG /
S7 - 2 INFO /
Top 2 processes are S1 : 4 and S6 : 3