Reg Expressions

The document discusses regular expressions in grep and provides examples of using different regex patterns and constructs in grep including literal matches, anchors, single character matching, bracket expressions, quantifiers, alternation, grouping and special backslash expressions.

Uploaded by

Eddie Peter
Regular expressions in grep (regex) with examples
grep is one of the most useful and powerful commands in Linux for text processing. grep
searches one or more input files for lines that match a regular expression and writes each matching
line to standard output.

Grep Regular Expression


A regular expression or regex is a pattern that matches a set of strings. A pattern consists of
operators, constructs, literal characters, and meta-characters, which have special meaning. GNU
grep supports three regular expression syntaxes: Basic, Extended, and Perl-compatible.

In its simplest form, when no regular expression type is given, grep interprets search patterns as
basic regular expressions. To interpret the pattern as an extended regular expression, use the -E (or
--extended-regexp) option.

In GNU’s implementation of grep there is no functional difference between the basic and extended
regular expression syntaxes. The only difference is that in basic regular expressions the
meta-characters ?, +, {, |, (, and ) are interpreted as literal characters. To keep the
meta-characters’ special meanings when using basic regular expressions, escape them with a
backslash (\). We will explain the meaning of these and other meta-characters later.
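
The escaping rule above can be seen in a small sketch. The sample lines are made up for illustration, and the backslash-escaped \+ in a basic pattern is a GNU extension, so this assumes GNU grep:

```shell
# Hypothetical sample input: one line with a literal "+", one with repeated "a".
sample=$(printf '%s\n' 'a+b' 'aab')

# Basic syntax: "+" is a literal character, so only the line "a+b" matches.
bre_literal=$(printf '%s\n' "$sample" | grep 'a+b')

# Basic syntax with a backslash (GNU extension): "\+" means "one or more".
bre_escaped=$(printf '%s\n' "$sample" | grep 'a\+b')

# Extended syntax: "+" is a quantifier without any escaping.
ere=$(printf '%s\n' "$sample" | grep -E 'a+b')

printf 'basic a+b     -> %s\n' "$bre_literal"
printf 'basic a\\+b    -> %s\n' "$bre_escaped"
printf 'extended a+b  -> %s\n' "$ere"
```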

Generally, you should always enclose the regular expression in single quotes to avoid the
interpretation and expansion of the meta-characters by the shell.

Literal Matches
The most basic usage of the grep command is to search for a literal character or series of
characters in a file. For example, to display all the lines containing the string “bash” in the
/etc/passwd file, you would run the following command:
grep bash /etc/passwd
The output should look something like this:
root:x:0:0:root:/root:/bin/bash
linuxize:x:1000:1000:linuxize:/home/linuxize:/bin/bash

In this example, the string “bash” is a basic regular expression that consists of four literal
characters. This tells grep to search for a string that has a “b” immediately followed by “a”, “s”,
and “h”.
By default, the grep command is case sensitive. This means that the uppercase and lowercase
characters are treated as distinct.
To ignore case when searching, use the -i option (or --ignore-case).
It is important to note that grep looks for the search pattern as a string, not a word. So if you were
searching for “gnu”, grep will also print the lines where “gnu” is embedded in larger words, such
as “cygnus” or “magnum”.
If the search string includes spaces, you need to enclose it in single or double quotation marks:

grep "Gnome Display Manager" /etc/passwd
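
The case-sensitivity and substring behavior described above can be tried without touching any real file. This is a minimal sketch; the sample lines are hypothetical and are fed to grep on standard input:

```shell
# Hypothetical sample lines standing in for a real file.
sample=$(printf '%s\n' 'GNU tools' 'cygnus' 'magnum' 'plain text')

# Case-sensitive by default: "GNU tools" is skipped, but "gnu" inside
# "cygnus" and "magnum" still matches because grep matches substrings.
case_sensitive=$(printf '%s\n' "$sample" | grep 'gnu')

# With -i the uppercase "GNU" matches as well.
ignore_case=$(printf '%s\n' "$sample" | grep -i 'gnu')

printf 'default:\n%s\n\nwith -i:\n%s\n' "$case_sensitive" "$ignore_case"
```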

Anchoring
Anchors are meta-characters that allow you to specify where in the line the match must be
found.
The ^ (caret) symbol matches the empty string at the beginning of a line. In the following example,
the string “linux” will match only if it occurs at the very beginning of a line.
grep '^linux' file.txt

The $ (dollar) symbol matches the empty string at the end of a line. To find a line that ends
with the string “linux”, you would use:
grep 'linux$' file.txt
You can also construct a regular expression using both anchors. For example, to find lines
containing only “linux”, run:
grep '^linux$' file.txt

Another useful example is the ^$ pattern that matches all empty lines.
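
As a rough illustration of the anchors above, the following sketch runs all four patterns against a few made-up lines (one of them empty). The -c option, which prints a count of matching lines instead of the lines themselves, is used here only to make the empty-line result visible:

```shell
# Hypothetical sample input; the second line is intentionally empty.
sample=$(printf '%s\n' 'linux is free' '' 'I like linux' 'linux')

# ^ anchors the match to the start of the line.
at_start=$(printf '%s\n' "$sample" | grep '^linux')

# $ anchors the match to the end of the line.
at_end=$(printf '%s\n' "$sample" | grep 'linux$')

# Both anchors together: the line must contain nothing but "linux".
whole_line=$(printf '%s\n' "$sample" | grep '^linux$')

# ^$ matches empty lines; -c counts them.
empty_count=$(printf '%s\n' "$sample" | grep -c '^$')

printf 'start:\n%s\n\nend:\n%s\n\nwhole line:\n%s\n\nempty lines: %s\n' \
  "$at_start" "$at_end" "$whole_line" "$empty_count"
```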

Matching Single Character


The . (period) symbol is a meta-character that matches any single character. For example, to match
anything that begins with “kan” then has two characters and ends with the string “roo”, you would
use the following pattern:
grep 'kan..roo' file.txt
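
A quick way to check how many characters each period consumes is to try the pattern on a few made-up words of different lengths:

```shell
# Hypothetical sample words: zero, one, and two characters between "kan" and "roo".
sample=$(printf '%s\n' 'kanroo' 'kanXroo' 'kangaroo' 'kan--roo')

# Each "." must match exactly one character, so exactly two are required.
two_between=$(printf '%s\n' "$sample" | grep 'kan..roo')

printf '%s\n' "$two_between"
```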

Bracket Expressions
Bracket expressions allow you to match any one of a group of characters by enclosing them in
brackets []. For example, to find the lines that contain “accept” or “accent”, you could use the
following expression:
grep 'acce[np]t' file.txt

If the first character inside the brackets is the caret ^, then it matches any single character not
enclosed in the brackets. The following pattern will match any string containing “co” followed by
any character except “l”, followed by “a”, such as “coca”, “cobalt” and so on, but will
not match the lines containing “cola”:

grep 'co[^l]a' file.txt
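
Both bracket forms can be compared side by side. The word list below is invented for the sketch:

```shell
# Hypothetical word list.
sample=$(printf '%s\n' 'accept' 'accent' 'access' 'coca' 'cobalt' 'cola')

# [np] matches exactly one character that is either "n" or "p".
either=$(printf '%s\n' "$sample" | grep 'acce[np]t')

# [^l] matches exactly one character that is anything except "l".
negated=$(printf '%s\n' "$sample" | grep 'co[^l]a')

printf 'acce[np]t:\n%s\n\nco[^l]a:\n%s\n' "$either" "$negated"
```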

Instead of placing characters one by one, you can specify a range of characters inside the brackets.
A range expression is constructed by specifying the first and last characters of the range separated
by a hyphen. For example, [a-e] is equivalent to [abcde] and [1-3] is equivalent to [123].

The following expression matches each line that starts with a capital letter:
grep '^[A-Z]' file.txt
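
The sketch below tries two ranges on made-up lines. Setting LC_ALL=C is an assumption added here, not something the text requires: how a range such as [A-Z] is interpreted can depend on the locale, and the C locale makes it follow plain ASCII order.

```shell
# Hypothetical sample lines.
sample=$(printf '%s\n' 'Alpha' 'beta' '3 items' 'Gamma' '7 left')

# Lines starting with an uppercase ASCII letter.
capital=$(printf '%s\n' "$sample" | LC_ALL=C grep '^[A-Z]')

# Lines starting with 1, 2, or 3.
low_digit=$(printf '%s\n' "$sample" | LC_ALL=C grep '^[1-3]')

printf '^[A-Z]:\n%s\n\n^[1-3]:\n%s\n' "$capital" "$low_digit"
```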

grep also supports predefined classes of characters that are enclosed in brackets. A class name is
itself used inside a bracket expression, for example [[:digit:]]. The following table shows some of
the most common character classes:

Class      Description
[:alnum:]  Alphanumeric characters.
[:alpha:]  Alphabetic characters.
[:blank:]  Space and tab.
[:digit:]  Digits.
[:lower:]  Lowercase letters.
[:upper:]  Uppercase letters.

For a complete list of all character classes, check the grep manual.
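
A short sketch of two of these classes in use, on invented sample lines. Note the double brackets: the outer pair is the bracket expression and the inner pair belongs to the class name.

```shell
# Hypothetical sample lines.
sample=$(printf '%s\n' 'abc' 'abc123' '42' 'UPPER')

# Lines containing at least one digit.
with_digit=$(printf '%s\n' "$sample" | grep '[[:digit:]]')

# Lines made up entirely of alphabetic characters.
letters_only=$(printf '%s\n' "$sample" | grep '^[[:alpha:]]*$')

printf 'contains a digit:\n%s\n\nletters only:\n%s\n' "$with_digit" "$letters_only"
```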

Quantifiers
Quantifiers allow you to specify the number of occurrences of items that must be present for a
match to occur. The following table shows the quantifiers supported by GNU grep:

Quantifier  Description
*           Match the preceding item zero or more times.
?           Match the preceding item zero or one time.
+           Match the preceding item one or more times.
{n}         Match the preceding item exactly n times.
{n,}        Match the preceding item at least n times.
{,m}        Match the preceding item at most m times.
{n,m}       Match the preceding item from n to m times.

The * (asterisk) character matches the preceding item zero or more times. The following will match
“right”, “sright”, “ssright” and so on:
grep 's*right' file.txt
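
To see that "zero or more" really includes zero, the sketch below runs the pattern over a few made-up words:

```shell
# Hypothetical sample words.
sample=$(printf '%s\n' 'right' 'sright' 'ssright' 'left')

# "s*" may match no "s" at all, so plain "right" is matched too.
star=$(printf '%s\n' "$sample" | grep 's*right')

printf '%s\n' "$star"
```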

Below is a more advanced pattern that matches all lines that start with a capital letter and end
with either a period or a comma. The .* regex matches any number of any characters:
grep -E '^[A-Z].*[.,]$' file.txt
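
Here is a small check of that pattern against invented sentences. LC_ALL=C is an added assumption so that [A-Z] follows ASCII order regardless of the system locale:

```shell
# Hypothetical sample sentences.
sample=$(printf '%s\n' 'Hello, world.' 'hello there.' 'Wait,' 'No ending')

# Must start with a capital letter AND end with "." or ",".
sentences=$(printf '%s\n' "$sample" | LC_ALL=C grep -E '^[A-Z].*[.,]$')

printf '%s\n' "$sentences"
```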

The ? (question mark) character makes the preceding item optional, so it can match at most once. The
following will match both “bright” and “right”. The ? character is escaped with a backslash because
we’re using basic regular expressions:
grep 'b\?right' file.txt

Here is the same regex using an extended regular expression:

grep -E 'b?right' file.txt
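
The two forms can be compared directly. In this sketch the pattern is anchored with ^ and $, which is a change from the commands above made only so that the effect of ? is visible on whole lines; the word list is invented, and \? in the basic form assumes GNU grep:

```shell
# Hypothetical sample words.
sample=$(printf '%s\n' 'bright' 'right' 'bbright' 'fright')

# Basic syntax: "?" must be escaped to act as a quantifier (GNU extension).
basic=$(printf '%s\n' "$sample" | grep '^b\?right$')

# Extended syntax: no escaping needed.
extended=$(printf '%s\n' "$sample" | grep -E '^b?right$')

printf 'basic:\n%s\n\nextended:\n%s\n' "$basic" "$extended"
```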

The + (plus) character matches the preceding item one or more times. The following will match
“sright” and “ssright”, but not “right”:
grep -E 's+right' file.txt
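
The difference from * is that at least one occurrence is required, as this sketch on made-up words shows:

```shell
# Hypothetical sample words.
sample=$(printf '%s\n' 'right' 'sright' 'ssright')

# "s+" needs at least one "s", so plain "right" is not matched.
plus=$(printf '%s\n' "$sample" | grep -E 's+right')

printf '%s\n' "$plus"
```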

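The brace characters {} allow you to specify the exact number, an upper or lower bound, or a
range of occurrences that must occur for a match to happen.
The following matches lines that contain a sequence of 3 to 9 digits. Because grep matches
substrings, a longer run of digits also contains such a sequence and therefore matches as well: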
grep -E '[[:digit:]]{3,9}' file.txt
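
The sketch below contrasts the unanchored pattern with an anchored variant. The anchored form is an addition for illustration, showing one way to restrict the match to lines that consist of exactly 3 to 9 digits; the sample values are made up:

```shell
# Hypothetical sample lines: 2, 3, and 10 digits, plus a non-numeric line.
sample=$(printf '%s\n' '12' '123' '1234567890' 'abc')

# Unanchored: any line containing a run of 3 to 9 digits matches,
# including the 10-digit line, since part of it is a 9-digit run.
unanchored=$(printf '%s\n' "$sample" | grep -E '[[:digit:]]{3,9}')

# Anchored: the whole line must be 3 to 9 digits.
anchored=$(printf '%s\n' "$sample" | grep -E '^[[:digit:]]{3,9}$')

printf 'unanchored:\n%s\n\nanchored:\n%s\n' "$unanchored" "$anchored"
```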

Alternation
The term alternation is a simple “OR”. The alternation operator | (pipe) allows you to specify
different possible matches that can be literal strings or expression sets. This operator has the lowest
precedence of all regular expression operators.
In the example below, we are searching for all occurrences of the words fatal, error, and
critical in the Nginx error log file:
grep 'fatal\|error\|critical' /var/log/nginx/error.log

If you use an extended regular expression, then the operator | should not be escaped, as shown
below:
grep -E 'fatal|error|critical' /var/log/nginx/error.log
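
Both spellings of the alternation can be tried on standard input instead of a real log. The lines below are invented and do not follow any actual Nginx log format; \| in the basic form assumes GNU grep:

```shell
# Hypothetical log-like lines (not real Nginx output).
sample=$(printf '%s\n' 'fatal: disk full' 'info: started' 'error: timeout' 'critical: overheating')

# Basic syntax: the pipe must be escaped (GNU extension).
basic=$(printf '%s\n' "$sample" | grep 'fatal\|error\|critical')

# Extended syntax: the pipe is an operator as-is.
extended=$(printf '%s\n' "$sample" | grep -E 'fatal|error|critical')

printf 'basic:\n%s\n\nextended:\n%s\n' "$basic" "$extended"
```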

Grouping
Grouping is a feature of regular expressions that allows you to group patterns together and
reference them as one item. Groups are created using parentheses ().

When using basic regular expressions, the parentheses must be escaped with a backslash (\).

The following example matches both “fearless” and “less”. The ? quantifier makes the (fear)
group optional:
grep -E '(fear)?less' file.txt
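
In the sketch below the pattern is anchored with ^ and $, an addition made so that the optional group decides which whole lines match. The words are invented, and the escaped \? in the basic form assumes GNU grep:

```shell
# Hypothetical sample words.
sample=$(printf '%s\n' 'fearless' 'less' 'earless' 'careless')

# Extended syntax: (fear) is a group, and ? makes the whole group optional.
extended=$(printf '%s\n' "$sample" | grep -E '^(fear)?less$')

# Basic syntax: parentheses and ? are escaped with backslashes.
basic=$(printf '%s\n' "$sample" | grep '^\(fear\)\?less$')

printf 'extended:\n%s\n\nbasic:\n%s\n' "$extended" "$basic"
```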

Special Backslash Expressions


GNU grep includes several meta-characters that consist of a backslash followed by a regular
character. The following table shows some of the most common special backslash expressions:

Expression  Description
\b          Match a word boundary.
\<          Match an empty string at the beginning of a word.
\>          Match an empty string at the end of a word.
\w          Match a word character (a letter, digit, or underscore).
\s          Match a whitespace character.

The following pattern matches the separate words “abject” and “object”. It does not match them
when they are embedded in larger words:
grep '\b[ao]bject\b' file.txt
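
A minimal sketch of the word-boundary expressions on made-up lines. These backslash expressions are GNU extensions, so GNU grep is assumed; the second pattern, using \<, is an extra illustration not taken from the text above:

```shell
# Hypothetical sample lines.
sample=$(printf '%s\n' 'abject poverty' 'an object' 'objection' 'subject')

# \b on both sides: "abject" or "object" must stand alone as a word.
whole_word=$(printf '%s\n' "$sample" | grep '\b[ao]bject\b')

# \< only: "obj" must begin a word, but the word may continue.
word_start=$(printf '%s\n' "$sample" | grep '\<obj')

printf 'whole word:\n%s\n\nword start:\n%s\n' "$whole_word" "$word_start"
```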

Conclusion
Regular expressions are used in text editors, programming languages, and command-line tools such
as grep, sed, and awk. Knowing how to construct regular expressions can be very helpful when
searching text files, writing scripts, or filtering command output.
