Log Monitoring and Analysis using Bash Script: Step-By-Step Guide

Imagine you’re a DevOps engineer responsible for managing a complex web application deployed on a cloud infrastructure. One day, you receive a frantic call from the operations team reporting a critical issue with the application. Without hesitation, you dive into the logs to identify the root cause of the problem. Hours pass as you manually sift through thousands of log entries, desperately searching for clues. Frustrated and exhausted, you realize there must be a better way to monitor and analyze logs.

Fortunately, there is. Let’s explore how to build a log monitoring and analysis script using Bash scripting. With this script, you can automate the process of monitoring log files, detecting patterns or anomalies, and generating actionable insights to streamline troubleshooting and enhance system reliability.

Step 1: Defining the Requirements: We need a log monitoring and analysis script that can:

  • Continuously monitor a specified log file for new entries.

  • Detect specific patterns or keywords within the log entries, such as error messages or warnings.

  • Generate summary reports based on the analysis.

  • Handle errors and exceptions gracefully.

Step 2: Setting Up the Environment: Ensure you have a Unix-like operating system with Bash installed. You’ll also need a sample log file for testing purposes.

# Check Bash version
bash --version

# Create sample log file
touch sample-log.txt
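An empty file gives the script nothing to analyze, so it helps to seed it with a few representative entries first (the messages and timestamps below are made up for testing):

```shell
# Seed the sample log with a few representative entries
cat > sample-log.txt <<'EOF'
2024-05-01 10:00:01 INFO Service started
2024-05-01 10:00:05 WARNING Disk usage at 85%
2024-05-01 10:00:09 ERROR Connection to database failed
EOF
```

With this content, the error and warning counters later in the script will each report one match.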

Step 3: Writing the Script:

1. Setting Variables:
#!/bin/bash

# path to the log file to monitor
log_file="/Users/janmeshSingh/Files/Marketfeed/sample-log.txt"

# initial modification time of the log file
# (GNU coreutils syntax; on macOS/BSD stat, use: stat -f %m "$log_file")
prev_mod_time=$(stat -c %Y "$log_file")

In this section, we initialize two variables:

  • log_file: Specifies the path to the log file that the script will monitor.

  • prev_mod_time: Stores the modification time of the log file. This will be used to track changes in the log file.
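Note that `stat -c %Y` is the GNU coreutils syntax; the BSD/macOS `stat` takes `-f %m` instead. If you need the script to run on both, one possible approach is a small helper (a sketch, `get_mtime` is a name introduced here for illustration):

```shell
# Portable mtime helper: try GNU syntax first, fall back to BSD/macOS
get_mtime() {
    if stat -c %Y "$1" >/dev/null 2>&1; then
        stat -c %Y "$1"   # GNU coreutils (Linux)
    else
        stat -f %m "$1"   # BSD/macOS
    fi
}

# usage: prev_mod_time=$(get_mtime "$log_file")
```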

2. Function to Analyze the Log File:

analyze_log() {
    echo "=== Log Analysis Summary ==="

    # count lines containing errors
    count_errors=$(grep -c -Ei "Error|failed" "$log_file")

    # count lines containing warnings
    count_warnings=$(grep -c -Ei "Warning" "$log_file")

    # print the final analysis report
    echo "Log file name: $(basename "$log_file")"
    echo "Total lines processed: $(wc -l < "$log_file")"
    echo "Total errors: $count_errors"
    echo "Total warnings: $count_warnings"
}

This function, analyze_log(), is responsible for analyzing the log file. It counts occurrences of specific patterns (errors and warnings) and prints a summary report. The script uses grep to search for these patterns: the -c option makes grep report the number of matching lines (not the total number of matches), -E enables extended regular expressions so "Error|failed" matches either word, and -i makes the match case-insensitive.
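The line-counting behavior of -c is worth seeing in isolation. With a small throwaway file (the path and contents below are just for demonstration):

```shell
# grep -c counts matching *lines*, not individual matches
printf 'ERROR one\nerror two failed\nWARNING three\nok\n' > /tmp/demo.log

grep -c -Ei "Error|failed" /tmp/demo.log   # 2 (lines 1 and 2 match)
grep -c -Ei "Warning" /tmp/demo.log        # 1 (line 3 matches)
```

Note that the second line contains both "error" and "failed" yet contributes only 1 to the count, since -c counts the line once.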

3. Main Loop to Monitor the Log File:

# Main loop to monitor the log file
echo "Monitoring log file: $log_file"
while true; do

    # Check that the log file exists
    if [ ! -f "$log_file" ]; then
        echo "Error: Log file '$log_file' not found."
        exit 1
    fi

    # Check that the log file is readable
    if [ ! -r "$log_file" ]; then
        echo "Error: Log file '$log_file' is not readable."
        exit 1
    fi

    # Check whether the file has been modified by comparing modification timestamps
    curr_mod_time=$(stat -c %Y "$log_file")
    if [ "$curr_mod_time" -gt "$prev_mod_time" ]; then
        # Clear the screen and display the last 10 lines of the log file
        clear
        tail -n 10 "$log_file"
        analyze_log
        # update the stored timestamp
        prev_mod_time=$curr_mod_time
    fi

    # delay of 1 second before checking again
    sleep 1
done

The main loop continuously monitors the log file for changes. On each iteration it verifies that the file exists and is readable, then compares the file’s current modification time against the one recorded on the previous iteration. If the file has changed, the script clears the screen, displays the last 10 lines of the log file, calls the analyze_log() function to produce a fresh summary, and stores the new timestamp for the next comparison. Note that exit 1 (not return, which is only valid inside a function) is used to abort the script when a check fails.
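Polling with sleep is portable and simple, but on Linux you can avoid polling entirely if the inotify-tools package is installed (an assumption; it is not part of the script above). A hedged sketch of an event-driven variant, reusing the article’s log_file variable and analyze_log() function (watch_log is a name introduced here for illustration):

```shell
# Event-driven alternative (Linux only, requires the inotify-tools package).
# Blocks until the kernel reports a modification instead of polling each second.
# Assumes log_file and analyze_log() are defined as in the script above.
watch_log() {
    while inotifywait -qq -e modify "$log_file"; do
        tail -n 10 "$log_file"
        analyze_log
    done
}
```

The trade-off: inotifywait wakes the script only when the file actually changes, at the cost of a Linux-specific dependency.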

4. Delay and Graceful Exit:

sleep 1

The sleep command introduces a delay of 1 second between each iteration of the main loop, preventing the script from consuming excessive resources.

trap 'echo -e "\nExiting..."; exit 0' SIGINT

This line sets up a trap that catches the SIGINT signal (Ctrl+C) and exits the script gracefully when the user interrupts it. Register the trap near the top of the script, before the main loop starts, so the handler is in place from the very first iteration.
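You can see the handler fire without pressing Ctrl+C by having a shell send SIGINT to itself (run here in a child bash so the demonstration stays isolated):

```shell
# A child shell registers the trap, sends itself SIGINT,
# and exits via the handler instead of reaching the final echo.
out=$(bash -c 'trap "echo Exiting...; exit 0" SIGINT; kill -INT $$; echo "never reached"')
echo "$out"   # prints: Exiting...
```

Because the trap calls exit 0, the command after kill never runs, which is exactly the clean-shutdown behavior we want on Ctrl+C.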

By breaking down the script into these components, we can clearly define the script’s behavior and functionality. This modular approach also makes it easier to maintain and extend the script in the future as requirements evolve.

Step 4: Testing and Validation

Test the script with sample log files to ensure it behaves as expected.
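A simple manual test: run the script in one terminal, then append new entries from a second terminal and confirm that the tail output and summary refresh within a second (monitor.sh below is a placeholder for whatever filename you saved the script under):

```shell
# Terminal 1: start the monitor
#   bash monitor.sh

# Terminal 2: append entries and watch the summary update
echo "$(date '+%F %T') ERROR payment service failed" >> sample-log.txt
echo "$(date '+%F %T') WARNING high memory usage" >> sample-log.txt
```

After these appends, the error and warning counters in the summary should each increase by one.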

Step 5: Refinement and Optimization

Optimize the script for performance and efficiency based on feedback and real-world usage scenarios.

In conclusion, building a log monitoring and analysis script in Bash empowers DevOps engineers to proactively monitor and analyze log files, enabling faster issue resolution and improved system reliability. By following the steps outlined in this guide, you can create a robust and effective script tailored to your specific monitoring needs.

For more such articles, follow me on my public profiles:

LinkedIn: linkedin.com/in/janmeshsingh00

GitHub: github.com/janmeshjs

Twitter: twitter.com/janmeshsingh2

Happy scripting!