Unit 3 Linux Regular Expression

The document provides an overview of Linux regular expressions (regex), detailing their syntax, metacharacters, and usage in various tools like grep, sed, and rename. It explains how to match patterns, including options for concatenation, alternatives, and occurrences, along with examples for practical application. Additionally, it covers different regex versions and specific commands for string manipulation in Linux.

Linux Regular Expression

A regular expression, also called a regex or regexp, is a pattern that describes a set of strings. It is a very powerful tool in Linux for searching and editing text.

Regular expressions can be used in a variety of programs such as grep, sed, vi, bash, rename and many more.

Regular Expression Metacharacters


A regular expression is built from ordinary characters and one or more metacharacters, which have a special meaning.

Metacharacter   Description

.               Matches any single character.
^               Matches the start of a string. As the first character inside
                square brackets ([^...]) it matches characters not in the set.
$               Matches the end of a string.
*               Matches the preceding item zero or more times.
\               Escapes the next character, or introduces a special sequence
                such as \b.
()              Groups regular expressions.
?               Matches the preceding item zero or one time.
+               Matches the preceding item one or more times.
{N}             Preceding item is matched exactly N times.
{N,}            Preceding item is matched N times or more.
{N,M}           Preceding item is matched at least N times, but not more
                than M times.
-               Represents a range inside square brackets, for example [a-z].
\b              Matches the empty string at the edge of a word.
\B              Matches the empty string if it is not at the edge of a word.
\<              Matches the empty string at the beginning of a word.
\>              Matches the empty string at the end of a word.

Regex Versions
There are three versions of regular expression syntax:

o BRE : Basic Regular Expressions
o ERE : Extended Regular Expressions
o PCRE: Perl-Compatible Regular Expressions

Depending on the tool or program, one or more of these versions can be used.

Linux grep Regular Expressions


The grep tool has the following options to select how the pattern is read:

o -E : Pattern is read as ERE (Extended Regular Expressions)
o -G : Pattern is read as BRE (Basic Regular Expressions); this is the default
o -P : Pattern is read as PCRE (Perl-Compatible Regular Expressions)
o -F : Pattern is read literally, as a fixed string.
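
The difference between a pattern read as a regular expression and one read literally can be seen in a minimal sketch. The two input lines are made up for illustration and are not taken from any file used in this unit.

```shell
# Hypothetical two-line input: one line contains a literal dot, one does not.
input=$(printf 'a.c\nabc\n')

# Default (-G, BRE): the dot matches any character, so both lines match.
regex_lines=$(printf '%s\n' "$input" | grep 'a.c')

# -F: the pattern is a fixed string, so only the literal "a.c" matches.
fixed_lines=$(printf '%s\n' "$input" | grep -F 'a.c')

printf 'regex:\n%s\nfixed:\n%s\n' "$regex_lines" "$fixed_lines"
```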

Print Lines Matching A Pattern


The grep command searches a file for lines that match the specified pattern and prints them.

Syntax:

grep <pattern> <fileName>

Example:

grep t msg.txt
grep l msg.txt
grep v msg.txt

Each command displays every line of msg.txt that contains the given character. When colour output is enabled, the matched text is highlighted.

Concatenating Characters
If a pattern consists of several characters in sequence, a line is displayed only if it contains those characters consecutively and in the same order.

Example:

grep tp msg.txt
grep in msg.txt
grep is msg.txt

Only lines that contain the exact sequence 'tp', 'in' or 'is', respectively, are displayed.
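
Since the contents of msg.txt are not shown here, the following sketch uses a made-up three-line file to contrast a single-character pattern with a concatenated one. The file contents and the temporary directory are assumptions for illustration only.

```shell
# Hypothetical sample file; the real msg.txt is not shown in this document.
workdir=$(mktemp -d)
printf 'http server\ntelnet client\nplain text\n' > "$workdir/msg.txt"

# Single character: every line containing "t" matches (all three lines here).
grep t "$workdir/msg.txt"

# Concatenated characters: "t" must be followed directly by "p",
# which is true only for the line "http server".
grep tp "$workdir/msg.txt"
```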

One Or The Other


Here the pipe symbol (|) is used as OR to signify one alternative or the other. All three versions are shown. The -E and -P syntax is the same, but with -G the pipe must be escaped with a backslash (\|, a GNU extension).

Syntax:

grep <option> '<pattern>|<pattern>' <fileName>

Example:

grep -E 'j|g' msg.txt
grep -P 'j|g' msg.txt
grep -G 'j\|g' msg.txt

A line is displayed if it matches either 'j' or 'g'.
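
A small sketch of why the backslash matters with -G, assuming GNU grep. The three input words are hypothetical.

```shell
# Hypothetical input lines, chosen so that one line contains neither letter.
sample=$(printf 'jam\negg\nmilk\n')

# ERE: a bare | separates the alternatives.
ere=$(printf '%s\n' "$sample" | grep -E 'j|g')

# BRE with GNU grep: the pipe has to be escaped to act as OR.
bre=$(printf '%s\n' "$sample" | grep -G 'j\|g')

# BRE with an unescaped pipe searches for the literal text "j|g".
# grep exits non-zero when nothing matches, hence the "|| true".
literal=$(printf '%s\n' "$sample" | grep -c -G 'j|g' || true)

printf 'ERE:\n%s\nBRE:\n%s\nliteral count: %s\n' "$ere" "$bre" "$literal"
```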

One Or More / Zero Or More


The * signifies zero or more occurrences of the preceding item and + signifies one or more occurrences.

Syntax:

grep <option> '<pattern>*' <fileName>

Example:

grep -E '1*' list
grep -E '1+' list

Because '1*' also matches zero occurrences of '1', the first command displays every line of the file, including lines without a '1'. The second command displays only lines that contain at least one '1'.
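
The difference is easiest to see on a small input. The four lines below are a hypothetical stand-in for the file 'list'.

```shell
# Hypothetical stand-in for the file "list".
numbers=$(printf '1\n11\n2\n21\n')

# '1*' also matches the empty string, so every line is selected.
star=$(printf '%s\n' "$numbers" | grep -E '1*')

# '1+' needs at least one "1", so the line "2" is dropped.
plus=$(printf '%s\n' "$numbers" | grep -E '1+')

printf 'star:\n%s\nplus:\n%s\n' "$star" "$plus"
```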

Match The End Of A String


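To match the end of a string we use the $ sign. Since grep reads its input line by line, this is the end of a line. It is safest to quote the pattern so that the shell never interprets the $.

Syntax:

grep '<pattern>$' <fileName>

Example:

grep 'r$' dupli.txt
grep 'e$' dupli.txt

Only lines that end with 'r' or 'e', respectively, are displayed.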
Match The Start Of A String
To match the start of a string (for grep, the start of a line) we use the caret sign (^).

Syntax:

grep '^<pattern>' <fileName>

Example:

grep '^o' dupli.txt

Only lines that begin with 'o' are displayed.
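
Both anchors can be tried on a short input. The three words are a hypothetical stand-in for dupli.txt, whose contents are not shown here.

```shell
# Hypothetical stand-in for dupli.txt.
words=$(printf 'over\nrope\nto\n')

# ^o: lines whose first character is "o".
starts_with_o=$(printf '%s\n' "$words" | grep '^o')

# r$ and e$: lines whose last character is "r" or "e".
ends_with_r=$(printf '%s\n' "$words" | grep 'r$')
ends_with_e=$(printf '%s\n' "$words" | grep 'e$')

printf '%s %s %s\n' "$starts_with_o" "$ends_with_r" "$ends_with_e"
```

Note that 'to' is not selected by '^o': it contains an 'o', but not at the start of the line.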

Separating Words
To match a pattern only where it stands as a separate word, enclose it in \b.

Syntax:

grep '\b<pattern>\b' <fileName>

Example:

grep '\bsome\b' file

The command "grep some file" displays all lines that contain the string 'some', including lines where it is part of a longer word such as 'something'. The command "grep '\bsome\b' file" displays only lines that contain 'some' as a separate word.

Note: This can also be done with the help of the -w option.

Syntax:

grep -w <pattern> <fileName>

Example:

grep -w some file

The command "grep -w some file" displays the same result as the \b pattern.
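
The following sketch compares the three commands on a made-up input, assuming GNU grep (which provides \b). The input lines are hypothetical.

```shell
# Hypothetical stand-in for "file".
text=$(printf 'some people\nsomething else\nawesome\n')

# Plain substring search: all three lines contain "some".
substring=$(printf '%s\n' "$text" | grep -c some)

# \b on both sides: "some" must stand alone as a word.
bounded=$(printf '%s\n' "$text" | grep '\bsome\b')

# -w applies the same whole-word test without changing the pattern.
whole_word=$(printf '%s\n' "$text" | grep -w some)

printf '%s\n%s\n%s\n' "$substring" "$bounded" "$whole_word"
```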

Linux rename Regular Expressions


The rename command is mostly used to search for a string in file names and replace it with another string. The syntax shown here is that of the Perl-based rename; some distributions ship a different rename command that takes other arguments.

Syntax:

rename 's/string/other string/' <fileNames>

Example:

rename 's/text/txt/' *

In every file name in the current directory, 'text' is converted into 'txt'.

You can also limit the replacement to selected files by giving a file name pattern.

Syntax:

rename 's/string/other string/' *.<extension>

Example:

rename 's/txt/TXT/' *.txt

For files such as a.txt, '.txt' is converted into '.TXT'.

In the above two examples the search string was present only at the end of the file name. But this example is different.

Example:

rename 's/txt/bbb/' atxt.txt

Only the first occurrence of the searched string is replaced, so atxt.txt becomes abbb.txt.

A Global Replacement
In the above example only the first 'txt' in 'atxt.txt' was replaced. To replace both occurrences we can use the global modifier 'g'.

Syntax:

rename 's/string/other string/g' <fileNames>

Example:

rename 's/txt/TXT/g' atxt.txt

Both occurrences of 'txt' are replaced with 'TXT', so atxt.txt becomes aTXT.TXT.
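
The effect of the 'g' modifier can be previewed without touching any files by running the same substitution through sed on the file name as a plain string. This is only a sketch of the substitution itself, not a call to rename.

```shell
# The file name from the examples above, treated as a plain string.
name='atxt.txt'

# Without g: only the first "txt" is replaced.
first_only=$(printf '%s\n' "$name" | sed 's/txt/bbb/')

# With g: every "txt" is replaced.
all_matches=$(printf '%s\n' "$name" | sed 's/txt/TXT/g')

printf '%s\n%s\n' "$first_only" "$all_matches"
```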

Case Insensitive Replacement


With the 'i' modifier, the search string is matched without regard to upper or lower case.

Syntax:

rename 's/string/other string/i' <fileNames>

Example:

rename 's/.text/.txt/i' *

All of '.text', '.TEXT' and '.Text' are replaced with '.txt'. Note that the unescaped dot matches any character; write \. to match only a literal dot.
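
The substitution can again be tried on plain strings with sed. This sketch assumes GNU sed, where the case-insensitive flag is written I; the file names are hypothetical. It also shows what the unescaped dot does to a name that contains 'text' without a dot.

```shell
# Hypothetical file names, treated as plain strings.
names=$(printf 'notes.TEXT\nreadme.Text\ncontext\n')

# \. matches a literal dot, $ anchors the match to the end of the name,
# and the I flag (GNU sed) ignores case.
renamed=$(printf '%s\n' "$names" | sed 's/\.text$/.txt/I')

# With an unescaped dot, "ntext" in "context" matches as well.
unescaped=$(printf 'context\n' | sed 's/.text/.txt/I')

printf '%s\n%s\n' "$renamed" "$unescaped"
```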

Linux Sed Regular Expressions


Stream Editor
The sed command is used for stream editing: it reads text from a file or a pipe, applies editing commands to it and writes the result to standard output.

Example:

echo interactive | sed 's/inte/dist/'
echo interactive | sed 's:inte:dist:'
echo interactive | sed 's_inte_dist_'
echo interactive | sed 's|inte|dist|'

Each command changes the string 'interactive' to 'distractive'. Instead of the forward slash (/), a colon (:), underscore (_) or pipe (|) also works as the delimiter.
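
A different delimiter is most useful when the text itself contains slashes. The paths in this sketch are hypothetical.

```shell
# With / as the delimiter every slash in the path must be escaped.
escaped=$(echo /usr/local/bin | sed 's/\/usr\/local/\/opt/')

# With | as the delimiter the path can be written as it is.
readable=$(echo /usr/local/bin | sed 's|/usr/local|/opt|')

printf '%s\n%s\n' "$escaped" "$readable"
```
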
In-Place Editor
The sed command is meant to be a stream editor, but it can also edit a file in place. For in-place editing the -i option is used.

For example, running sed with the -i option and the expression 's/today/tomorrow/' on a file named 'file' converts 'today' into 'tomorrow' in the file itself instead of printing the result.
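
A minimal sketch of such a command, assuming GNU sed. The file contents and the temporary directory are made up for illustration.

```shell
# Create a throwaway file named "file" in a temporary directory.
workdir=$(mktemp -d)
printf 'today is Monday\n' > "$workdir/file"

# -i writes the result back to the file instead of standard output.
sed -i 's/today/tomorrow/' "$workdir/file"

cat "$workdir/file"
```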

Simple Back Referencing


In the replacement part of a sed command, the ampersand (&) stands for the string that was matched by the search pattern. A double ampersand (&&) therefore prints the matched string twice.

For example, searching for 'four' in the string 'fourty' and replacing it with '&&' prints 'fourfourty'.
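
The result described above can be reproduced with a one-line sketch:

```shell
# & stands for the text matched by the pattern ("four"), so && writes it twice.
doubled=$(echo fourty | sed 's/four/&&/')

printf '%s\n' "$doubled"
```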

A Dot For Any Character


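In regex a simple dot can signify any single character.

For example, a pattern made of dots and hyphens such as '....-..-..' matches a date like 2014-06-30, which can then be replaced as a whole by a date format string.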
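
The original command is not preserved here, so the following is only a plausible sketch of it. The replacement text 'YYYY-MM-DD' is an assumption.

```shell
# Each dot matches one character, so the pattern fits any text of the
# shape ????-??-??; the whole match is replaced by a format label.
masked=$(echo 2014-06-30 | sed 's/....-..-../YYYY-MM-DD/')

printf '%s\n' "$masked"
```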

Multiple Back Referencing


When more than one pair of parentheses is used, each pair forms a group. Each group can be referenced separately in the replacement by its number: \1, \2, \3 and so on.

For example, if the date 2014-06-30 is matched with three groups, 2014 is referenced as \1, 06 is referenced as \2 and 30 is referenced as \3, so the date can be printed in different formats.
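
A sketch of such a reordering, shown in both basic and extended syntax. The output order day-month-year is chosen for illustration.

```shell
# BRE groups are written \( \); \1, \2 and \3 refer to them in order.
reordered=$(echo 2014-06-30 | sed 's/\(....\)-\(..\)-\(..\)/\3-\2-\1/')

# The same substitution in ERE syntax (-E), where groups are plain ( ).
reordered_ere=$(echo 2014-06-30 | sed -E 's/(....)-(..)-(..)/\3-\2-\1/')

printf '%s\n%s\n' "$reordered" "$reordered_ere"
```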

White Space
The white space syntax is '\s' and the tab syntax is '\t'. In GNU sed, '\s' matches a single white space character, such as a space or a tab, while '\t' matches only a tab.
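
The difference between the two can be sketched as follows, assuming GNU sed. The input string is hypothetical.

```shell
# Input with one tab and one space between the words.
spaced=$(printf 'one\ttwo three\n')

# \s matches either kind of white space.
any_space=$(printf '%s\n' "$spaced" | sed 's/\s/_/g')

# \t matches only the tab.
tab_only=$(printf '%s\n' "$spaced" | sed 's/\t/_/g')

printf '%s\n%s\n' "$any_space" "$tab_only"
```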

Optional Occurrence
You can make the preceding item optional by following it with a question mark (?). With sed this needs extended syntax (the -E option).

For example, in the pattern 'iii?' the third 'i' is optional. It means that at least two 'i' must be present for the match to be converted into 'Y'.
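
A sketch of this pattern on a made-up input. The four input lines are assumptions; only the pattern and the replacement 'Y' come from the text.

```shell
# Lines with one to four consecutive "i" characters.
runs=$(printf 'i\nii\niii\niiii\n')

# iii? means "ii" followed by an optional third "i" (ERE syntax, hence -E).
optional=$(printf '%s\n' "$runs" | sed -E 's/iii?/Y/')

printf '%s\n' "$optional"
```

The single 'i' is left unchanged, 'ii' and 'iii' both become 'Y', and in 'iiii' the first three characters are replaced, leaving 'Yi'.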

Exact n Times Occurrence

An exact number of occurrences is specified by "{times}".

For example, 'i{3}' specifies exactly three consecutive occurrences of 'i'.
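
A sketch with sed in extended syntax. The input lines and the replacement 'Y' are chosen for illustration.

```shell
# Lines with one to four consecutive "i" characters.
runs=$(printf 'i\nii\niii\niiii\n')

# i{3} matches exactly three "i" in a row; shorter runs are left alone.
exact=$(printf '%s\n' "$runs" | sed -E 's/i{3}/Y/')

printf '%s\n' "$exact"
```
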
Occurrence In Range
We can also specify the number of occurrences as a range. If we specify the range as {m,n}, then 'm' denotes the minimum and 'n' denotes the maximum number of occurrences.

For example, 'i{3,4}' specifies a minimum of 3 and a maximum of 4 consecutive occurrences of 'i'.
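
A sketch of the range 3 to 4 with sed in extended syntax. The input lines and the replacement 'Y' are assumptions for illustration.

```shell
# Lines with one to five consecutive "i" characters.
runs=$(printf 'i\nii\niii\niiii\niiiii\n')

# i{3,4} matches three or four "i" in a row and prefers the longer match.
ranged=$(printf '%s\n' "$runs" | sed -E 's/i{3,4}/Y/')

printf '%s\n' "$ranged"
```

In the last line only four of the five 'i' are consumed by the match, so one 'i' remains after the 'Y'.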
