Logs and Metrics in Splunk

First, logs. Log data can be anything from error messages to website traffic stats to how many times someone clicked on a button.

Now, let’s say you have all this data sitting in some kind of log file (which is just a fancy way of saying “text document”) and you want to see what’s going on with it. That’s where Splunk comes in! It’s like a super-powered search engine for your logs, allowing you to filter out the noise and focus on the important stuff.
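Even before you've told Splunk anything about your data's structure, a plain keyword search already works. For instance (using `my_logs` as a made-up index name):

index=my_logs "Connection refused"

That returns every event containing the phrase, which is often all you need for a quick look.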

For example, let’s say you have a log file that looks something like this:

2019-05-31 14:27:36 [ERROR] Failed to connect to database: Connection refused
2019-05-31 14:28:02 [WARNING] Server is running low on memory, please free up some resources
2019-05-31 14:28:17 [INFO] Successfully connected to database

Each entry has a timestamp, a log level, and a message.
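One thing worth knowing: out of the box, Splunk won't necessarily recognize `[ERROR]` as a searchable field. A quick sketch of pulling it out at search time with the `rex` command (the field name `level` and the index name are assumptions for this example):

index=my_logs sourcetype="application"
| rex field=_raw "\[(?<level>ERROR|WARNING|INFO)\]"

After this, every event has a `level` field you can filter and group on.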

If you wanted to see all the errors that happened between May 31st at 1pm and June 1st at 2pm, you could run a search like this in Splunk:

index=my_logs sourcetype="application" level=ERROR earliest="05/31/2019:13:00:00" latest="06/01/2019:14:00:00"

The `earliest` and `latest` time modifiers bound the search window (Splunk accepts them in `MM/DD/YYYY:HH:MM:SS` format, as epoch times, or as relative times like `-1d`), and `level=ERROR` keeps only the error events.

This would return all the log entries that match those criteria, which in this case would be the first entry in our example. Pretty cool, right? But what if you wanted to see how many errors happened during a specific time period? That’s where metrics come in!

Metrics allow you to perform calculations on your data and get some insights into trends or patterns that might not be immediately obvious from just looking at the raw logs. For example, let’s say we want to know how many errors occurred per hour over the course of a day:


index=my_logs sourcetype="application" level=ERROR earliest=-1d
| bin _time span=1h
| stats count by _time

This would return a table that looks something like this:

| _time | count |
| --- | --- |
| 2019-05-31T14:00:00Z | 1 |
| 2019-05-31T15:00:00Z | 2 |
| 2019-05-31T16:00:00Z | 4 |
| … | … |

As you can see, the `bin` command rounds each event’s `_time` down to the start of its hour, and `stats count by _time` then counts the errors in each hourly bucket. The `earliest=-1d` time modifier tells Splunk to only look at data from the last day.
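Worth knowing: Splunk has a shortcut for exactly this bin-then-count pattern. The `timechart` command does the hourly bucketing and the counting in one step, so the same search (with the same assumed index and sourcetype) can be written as:

index=my_logs sourcetype="application" level=ERROR earliest=-1d
| timechart span=1h count

`timechart` also fills in empty buckets with zeros, which makes the results nicer to chart.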

Logs and metrics in Splunk, made simple(ish)! Of course, this is just scratching the surface of what’s possible with Splunk; there are all kinds of advanced features and techniques that can help you get even more insights from your data. But hopefully this gives you a good starting point to explore on your own!
