grep is a CLI tool created in 1974 that is pre-installed on any Linux/Unix-like/BSD* system.
grep is an acronym and stands for "globally search for a regular expression and print matching lines" - quite a mouth full but a good description of what
grep does. The more I worked in the Sys-/DevOps domain, the more I found myself using
awk to select parts of files for further processing. In this article I want to highlight some things
grep can do that I didn't know about but wish I did when I started using it.
1. Grep can open files
Contrary to popular believe, you do not need
cat file.txt | grep pattern -
grep can open files just fine!
A lot of tutorials show
grep reading files opened by
cat. Often times you'll find code like the following:
$ cat app.log | grep "connection refused"
cat command is unnecessary here and can be omitted.
# grep pattern file
$ grep "connection refused" app.log
2. The different grep "variants"
There isn't just grep but variants of it with different purposes. The man page says the following:
grep is used for simple patterns and basic regular expressions (BREs); egrep can handle extended regular expressions (EREs) fgrep is quicker than both grep and egrep, but can only handle fixed patterns (i.e. it does not interpret regular expressions).
So there is actually
fgrep! grep is the "should-work" version that can probably handle most cases, like getting all lines containing the word
grep "ERROR" app.log) or finding lines that start with a specific word (
grep "^FATAL" app.log).
egrep is used for more advanced regular expressions, like finding all lines that contain one or more keywords
egrep "(ERROR|WARN|INFO)" app.log -
grep would only find the literal string "ERROR|WARN|INFO", egrep will find either
fgrep will find things fast, but doesn't use any regular expression. It's best when dealing with large files and no regex is needed, but I've never actually used it.
3. Only print non-matching lines
grep can show all lines except those matching the search pattern - so basically the opposite of what it usually does. This mode is enabled with
The following example
$ egrep -v "(INFO|WARN)" app.log
DEBUG: my_var = 12
ERROR: Oh no, my_var != 42
4. Follow and filter output of a file with
tail is another CLI program that can be used to show lines of a file starting from the end (the opposite of the
head command). For example,
head -n 20 shows the last 20 lines of a file.
tail is incredibly useful in debugging running applications because it has an option to follow (
-f) which means it will print out the log file as it is being written.
What I did not know is that
tail -f can be combined with
grep to filter the log file the moment it is being written!
# follow the log output and filter for lines containing `Connection refused`
$ tail -f app.log | grep "Connection refused"
5. Show surrounding lines
grep has two parameters to show surrounding lines:
-B shows lines Before,
-A shows lines After the match.
To illustrate this, assume we have a file with the following content:
Hello world, this is some text to be used with the grep surrounding example so we need to fill a few lines.
Now we do a grep searching for
example and the two lines above the match, which gives us the following
$ grep -B 2 "example" example.txt
some text to be used
with the grep surrounding
example so we need to fill
-A are incredibly useful when we care for the surroundings of a match.
And that's it for today! I hope any of the tips above have been helpful.