Filtering Logs by Date Range in Bash
This week I wanted to analyse a log file in a shell script. It was a single file containing months of records, and I wanted to look at the last day, the last week and the last month separately. I needed a command that would filter out unwanted lines so I could pipe the result into other tools like grep and cut.
I found a good solution using features of GNU date. This works out of the box on Linux; on macOS you would need to install GNU coreutils first, for example with Homebrew.
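If the script has to run on both Linux and macOS, one option is to pick the date command at startup. This is only a sketch: the variable name date_cmd is made up for illustration, and it assumes Homebrew's coreutils package exposes GNU date as gdate.

```shell
# Prefer gdate (the name Homebrew's coreutils uses for GNU date), else plain date.
if command -v gdate >/dev/null 2>&1; then
    date_cmd=gdate
else
    date_cmd=date
fi

# Sanity check: GNU date accepts -d with a relative expression.
if "$date_cmd" -d "2018-06-03 - 7 days" +"%d/%b/%Y" >/dev/null 2>&1; then
    echo "using $date_cmd"
else
    echo "no GNU-compatible date found" >&2
fi
```

The rest of the script would then call "$date_cmd" wherever the examples here call date.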
Every line contains a date in this format: 03/Jun/2018. If I want the last week then I need to search for lines matching one of 03/Jun/2018, 02/Jun/2018, 01/Jun/2018, and (here’s where it gets tricky) 31/May/2018, and so on. What I need is a tool that already knows how calendars work.
GNU date does exactly what I need with the -d option:
$ date -d "2018-06-03 - 7 days" +"%d/%b/%Y"
27/May/2018
I can leave out the starting date if I want to do it relative to the current time:
$ date -d "-7 days" +"%d/%b/%Y"
21/Jun/2018
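To watch the month rollover happen, the same date arithmetic can be wrapped in a small function and anchored to a fixed day. This is a minimal sketch assuming GNU date; days_back is a hypothetical helper name, and LC_ALL=C is my addition to keep the month abbreviations in English on systems with another locale.

```shell
# days_back START COUNT: print START and the COUNT-1 days before it,
# newest first, in the log's dd/Mon/yyyy format. (Hypothetical helper.)
days_back() {
    local start="$1" count="$2" i=0
    while [ "$i" -lt "$count" ]; do
        # GNU date does the calendar arithmetic, including month boundaries.
        LC_ALL=C date -d "$start - $i days" +"%d/%b/%Y"
        i=$((i + 1))
    done
}

days_back 2018-06-03 4
```

With GNU date this prints 03/Jun/2018, 02/Jun/2018, 01/Jun/2018 and 31/May/2018, one per line.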
The last step is to generate a series of dates and feed these as search terms to grep. Grep can take a list of entries with the -f option. I’ll send them via stdin:
$ for i in `seq 0 8`; do date -d "-$i days" +"%d/%b/%Y"; done | grep -f - file.log
(... log entries matching today or the previous 8 days ...)
Breaking that down a bit, it starts with a for loop:
for i in `seq 0 8`
do
date -d "-$i days" +"%d/%b/%Y"
done
This prints a list of dates like this:
28/Jun/2018
27/Jun/2018
26/Jun/2018
25/Jun/2018
24/Jun/2018
23/Jun/2018
22/Jun/2018
21/Jun/2018
20/Jun/2018
This is piped to the grep command.
| grep -f - file.log
Since the filename file.log is specified, grep reads that file from disk rather than filtering stdin. The -f option tells it to read the search terms from a file, and - is shorthand for “use stdin”, so it uses the output of the for loop as the list of search terms.
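Here is a small self-contained way to try the grep -f - behaviour without a real log. The sample lines are made up for illustration, and the -F flag is my addition: it tells grep to treat each search term as a fixed string rather than a regular expression. It is optional here because the dates contain no regex metacharacters, but it states the intent.

```shell
# Build a tiny sample log (made-up lines, for illustration only).
log=$(mktemp)
printf '%s\n' \
    '10.0.0.1 - - [03/Jun/2018:10:00:00 +0000] "GET / HTTP/1.1" 200' \
    '10.0.0.2 - - [31/May/2018:09:30:00 +0000] "GET /a HTTP/1.1" 200' \
    '10.0.0.3 - - [12/Apr/2018:08:15:00 +0000] "GET /b HTTP/1.1" 404' > "$log"

# Two search terms arrive on stdin; -f - reads them from there,
# while the log itself is read from the named file.
matches=$(printf '%s\n' '03/Jun/2018' '31/May/2018' | grep -F -f - "$log")
printf '%s\n' "$matches"

rm -f "$log"
```

Only the June and May lines come back; the April line matches neither term and is dropped.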
To examine a different range, like the last month, change the 8 in the seq command to 31.
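Rather than editing the number each time, the whole pipeline can be wrapped in a function that takes the range as an argument. This is a sketch under the same assumptions as the one-liner (GNU date, dates in dd/Mon/yyyy format); the name recent_lines is made up, and LC_ALL=C and grep -F are my additions.

```shell
# recent_lines FILE DAYS: print the lines of FILE dated today or within
# the previous DAYS days. (Hypothetical helper name.)
recent_lines() {
    local file="$1" days="$2" i=0
    while [ "$i" -le "$days" ]; do
        # One search term per day, newest first.
        LC_ALL=C date -d "-$i days" +"%d/%b/%Y"
        i=$((i + 1))
    done | grep -F -f - "$file"
}
```

Usage would look like recent_lines file.log 7 for roughly the last week, or recent_lines file.log 31 for the last month, and the output can be piped on to cut or another grep as before.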