L5 - Reg Exp

The document discusses various UNIX commands used for text processing and manipulation including regular expressions, cut, paste, sed, tr, grep, and sort. It provides details on the format and examples of how each command can be used.

Uploaded by

gauri Varshney

Regular expressions

 Used by several different UNIX commands, including ed, sed, awk, and grep
 A period '.' matches any single character
 .X. matches any X that is surrounded by any two
characters
 Caret character ^ matches the beginning of the
line
 ^Bridgeport matches the characters Bridgeport
only if they occur at the beginning of the line
Regular expressions (continued)
 A dollar sign '$' matches the end of the line
 Bridgeport$ matches the characters Bridgeport only if they are the very last characters on the line
 .$ matches any single character at the end of the line
 To match a special character literally, precede it with a backslash '\' to remove its special meaning
 \.$ matches any line that ends with a period
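The anchors and the backslash escape can be tried with grep. This is a minimal sketch; the sample file and its contents are assumptions made up for illustration.

```shell
# Hypothetical sample file; its contents are assumptions for illustration.
sample=$(mktemp)
printf '%s\n' 'Bridgeport is a city' 'I live in Bridgeport' 'The end.' 'no period here' > "$sample"

starts=$(grep -c '^Bridgeport' "$sample")   # count of lines that begin with Bridgeport
ends=$(grep -c 'Bridgeport$' "$sample")     # count of lines that end with Bridgeport
period=$(grep '\.$' "$sample")              # lines whose last character is a literal period

echo "$starts $ends $period"
rm -f "$sample"
```

The patterns are single-quoted so the shell passes $ and \ to grep unchanged.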
Regular expressions (continued)
 ^$ matches any line that contains no characters
 [...] matches any single character from the set enclosed in the brackets
 [tT] matches a lowercase or uppercase t; [tT]he matches that letter followed immediately by the characters he
 [A-Z] matches any uppercase letter
 [A-Za-z] matches any uppercase or lowercase letter
 [^A-Z] matches any character except an uppercase letter
 [^A-Za-z] matches any non-alphabetic character
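A short sketch of the bracket expressions above, run through grep. The sample lines are assumptions, and LC_ALL=C is set as a precaution so that ranges such as A-Z follow plain ASCII order.

```shell
# Hypothetical sample file; the last line is deliberately empty.
words=$(mktemp)
printf '%s\n' 'the cat' 'The dog' 'ALL CAPS' '1234' '' > "$words"

the_count=$(LC_ALL=C grep -c '[tT]he' "$words")          # lines containing the or The
upper_count=$(LC_ALL=C grep -c '[A-Z]' "$words")         # lines containing an uppercase letter
nonalpha_start=$(LC_ALL=C grep '^[^A-Za-z]' "$words")    # lines starting with a non-letter
empty_count=$(grep -c '^$' "$words")                     # lines with no characters at all

echo "$the_count $upper_count $nonalpha_start $empty_count"
rm -f "$words"
```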
Regular expressions (continued)
 An asterisk (*) matches zero or more occurrences of the preceding character or regular expression
 X* matches zero, one, two, three, … capital X's
 XX* matches one or more capital X's
 .* matches zero or more occurrences of any character
 e.*e matches all the characters from the first e on the line to the last one
 [A-Za-z][A-Za-z]* matches any alphabetic character followed by zero or more alphabetic characters
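One way to see what each asterisk pattern matches is to let sed replace or mark the matched text. This is a sketch only; the input strings are assumptions.

```shell
# X* can match zero X's, so it matches the empty string at the start of the line.
zero_ok=$(printf 'aXXXb\n' | sed 's/X*/-/')
# XX* needs at least one X, so it replaces the run of X's.
one_plus=$(printf 'aXXXb\n' | sed 's/XX*/-/')
# e.*e is greedy: it runs from the first e to the last e (& is the matched text).
greedy=$(printf 'the quick fence\n' | sed 's/e.*e/[&]/')
# One letter followed by zero or more letters: the first whole word.
word=$(printf '42 apples\n' | LC_ALL=C sed 's/[A-Za-z][A-Za-z]*/WORD/')

printf '%s\n' "$zero_ok" "$one_plus" "$greedy" "$word"
```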
Regular expressions (continued)
 [-0-9] matches a single dash or digit character (ORDER IS IMPORTANT: to be taken literally, the dash must come first or last inside the brackets)
 [0-9-] same as [-0-9]
 [^-0-9] matches any character except digits and dash
 []a-z] matches a right bracket or lowercase letter (ORDER IS IMPORTANT: the right bracket must come first)
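A minimal sketch of the placement rules for the dash and the right bracket, using grep on made-up input lines (the inputs are assumptions).

```shell
# Dash listed first: lines containing a dash or a digit.
dash_or_digit=$(printf '%s\n' 'a-b' '123' 'abc' | LC_ALL=C grep -c '[-0-9]')
# Negated set: lines containing something other than a dash or a digit.
neither=$(printf '%s\n' '12-34' '12a' | LC_ALL=C grep '[^-0-9]')
# Right bracket listed first: lines containing ] or a lowercase letter.
bracket=$(printf '%s\n' ']' 'abc' 'XYZ' | LC_ALL=C grep -c '[]a-z]')

echo "$dash_or_digit $neither $bracket"
```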
Regular expressions (continued)
 \{min,max\} matches a precise number of occurrences of the preceding regular expression
 min specifies the minimum number of occurrences
of the preceding regular expression to be matched,
and max specifies the maximum
 w\{1,10\} matches from 1 to 10 consecutive w’s
 [a-zA-Z]\{7\} matches exactly seven alphabetic
characters
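The interval notation can be checked the same way. In this sketch the 12-character run of w's is generated rather than typed, and all inputs are assumptions.

```shell
# Exactly seven letters in a row: only the first line qualifies.
seven=$(printf '%s\n' 'abcdefg' 'abc' | LC_ALL=C grep -c '[a-zA-Z]\{7\}')

# Build a run of twelve w's, then replace the first match of w\{1,10\}.
# The match is greedy but stops at ten, so two w's are left over.
capped=$(printf '%012d\n' 0 | tr 0 w | sed 's/w\{1,10\}/-/')

echo "$seven $capped"
```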
Regular expressions (continued)
 X\{5,\} matches at least five consecutive X's
 \(...\) is used to save matched characters
 ^\(.\) matches the first character on the line and stores it in register one
 There are nine registers, numbered 1 to 9
 To retrieve what is stored in a register, \n is used, where n is the register number
 Example: ^\(.\)\1 matches the first two characters on a line if they are both the same character
Regular expressions (continued)
 ^\(.\).*\1$ matches all lines in which the first character on the line is the same as the last. Note that .* matches all the characters in between
 ^\(...\)\(...\) stores the first three characters on the line in register 1 and the next three characters in register 2
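A sketch of saved matches in practice, using grep to select lines and sed to reorder the saved text. The sample words are assumptions.

```shell
# Hypothetical word list.
list=$(mktemp)
printf '%s\n' 'aardvark' 'level' 'hello' > "$list"

doubled=$(grep '^\(.\)\1' "$list")        # first two characters are the same
same_ends=$(grep '^\(.\).*\1$' "$list")   # first and last characters are the same

# Registers 1 and 2 hold the first and second groups of three characters;
# the replacement writes them back in the opposite order.
swapped=$(printf 'abcdef\n' | sed 's/^\(...\)\(...\)/\2\1/')

printf '%s\n' "$doubled" "$same_ends" "$swapped"
rm -f "$list"
```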
cut
Used in extracting various fields of data from a data file or the
output of a command

$ who
bgeorge pts/16 Oct 5 15:01 (216.87.102.204)
abakshi pts/13 Oct 6 19:48 (216.87.102.220)
tphilip pts/11 Oct 2 14:10 (AC8C6085.ipt.aol.com)
$ who | cut -c1-8,18-
bgeorge Oct 5 15:01 (216.87.102.204)
abakshi Oct 6 19:48 (216.87.102.220)
tphilip Oct 2 14:10 (AC8C6085.ipt.aol.com)
$
Format: cut -cchars file
 chars specifies what characters to extract from each line of file.
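A minimal sketch of character-based extraction. The line below imitates fixed-width who output; its column widths are an assumption chosen so that the date starts in column 19.

```shell
# Build one fixed-width line: name in columns 1-8, terminal in columns 10-17.
line=$(printf '%-8s %-8s %s' 'bgeorge' 'pts/16' 'Oct  5 15:01')

name_only=$(printf '%s\n' "$line" | cut -c1-8)          # columns 1 through 8
name_and_time=$(printf '%s\n' "$line" | cut -c1-8,18-)  # columns 1-8, then 18 to end of line

printf '[%s]\n' "$name_only" "$name_and_time"
```

Because cut -c counts characters, it only works reliably when the input columns have fixed widths.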
cut (continued)
 Examples: -c5 (character 5), -c1,3,4 (characters 1, 3 and 4), -c10-15 (characters 10 through 15), -c5- (character 5 through the end of the line)
 The -d and -f options are used with cut when you have data that is delimited by a particular character
 Format: cut -ddchar -ffields file
 dchar: the character that delimits the fields (default: tab character)
 fields: the fields to be extracted from file
cut (continued)
$ cat phonebook
Edward 336-145
Alice 334-121
Sony 332-336
Robert 326-056

$ cut -f1 phonebook

Edward
Alice
Sony
Robert

$
cut (continued)
$ cat /etc/passwd
root:x:0:1:Super-User:/:/sbin/sh
daemon:x:1:1::/:
bin:x:2:2::/usr/bin:
sys:x:3:3::/:
adm:x:4:4:Admin:/var/adm:
lp:x:71:8:Line Printer Admin:/usr/spool/lp:
uucp:x:5:5:uucp Admin:/usr/lib/uucp:
nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
listen:x:37:4:Network Admin:/usr/net/nls:
nobody:x:60001:60001:Nobody:/:
noaccess:x:60002:60002:No Access User:/:
oracle:*:101:67:DBA Account:/export/home/oracle:/bin/csh
webuser:*:102:102:Web User:/export/home/webuser:/bin/csh
abuzneid:x:103:100:Abdelshakour Abuzneid:/home/abuzneid:/sbin/csh
$
cut (continued)
$ cut -d: -f1 /etc/passwd
root
daemon
bin
sys
adm
lp
uucp
nuucp
listen
nobody
noaccess
oracle
webuser
abuzneid
$
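Field-based extraction can be tried on a small copy of the data. This is a sketch; the two-line file is an assumption, not the real /etc/passwd.

```shell
# Hypothetical two-line excerpt in /etc/passwd format.
pw=$(mktemp)
printf '%s\n' 'root:x:0:1:Super-User:/:/sbin/sh' \
              'oracle:*:101:67:DBA Account:/export/home/oracle:/bin/csh' > "$pw"

names=$(cut -d: -f1 "$pw")        # field 1: login name
shells=$(cut -d: -f1,7 "$pw")     # fields 1 and 7, still joined by the delimiter

# With no -d option, the delimiter is a tab.
phone=$(printf 'Edward\t336-145\n' | cut -f2)

printf '%s\n' "$names" "$shells" "$phone"
rm -f "$pw"
```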
paste
 Format: paste files
 tab character is a default delimiter
paste (continued)
 Example:
$ cat students
Sue
Vara
Elvis
Luis
Eliza
$ cat sid
578426
452869
354896
455468
335123
$ paste students sid
Sue 578426
Vara 452869
Elvis 354896
Luis 455468
Eliza 335123
$
paste (continued)
 The -s option tells paste to paste together lines from the same file, not from alternate files
 To change the delimiter, the -d option is used
paste (continued)
 Examples:
$ paste -d '+' students sid
Sue+578426
Vara+452869
Elvis+354896
Luis+455468
Eliza+335123

$ paste -s students
Sue Vara Elvis Luis Eliza

$ ls | paste -d ' ' -s -
addr args list mail memo name nsmail phonebook programs roster sid students test tp twice user

$
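The paste options can be reproduced in a scratch directory. This is a sketch; the shortened students and sid files are assumptions.

```shell
# Hypothetical scratch copies of the two files.
dir=$(mktemp -d)
printf '%s\n' Sue Vara Elvis > "$dir/students"
printf '%s\n' 578426 452869 354896 > "$dir/sid"

side_by_side=$(paste "$dir/students" "$dir/sid")      # joined with the default tab
plus=$(paste -d '+' "$dir/students" "$dir/sid")       # joined with + instead
one_line=$(paste -s -d ' ' "$dir/students")           # all lines of one file on one line

printf '%s\n' "$side_by_side" "$plus" "$one_line"
rm -rf "$dir"
```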
sed
 sed (stream editor) is a program used for editing
data
 Unlike ed, sed cannot be used interactively
 Format: sed command file
 command: applied to each line of the specified file
 file: if no file is specified, then standard input is
assumed
 sed writes the output to the standard output
 The command s/Unix/UNIX/ is applied to every line in the file; it replaces the first Unix on each line with UNIX
sed (continued)
 sed makes no changes to the original input file
 The command 's/Unix/UNIX/g' is applied to every line in the file. It replaces every Unix with UNIX; "g" means global
 With the -n option, only selected lines are printed
 Example: sed -n '1,2p' file prints the first two lines
 Example: sed -n '/UNIX/p' file prints any line containing UNIX
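A short sketch comparing the substitute command with and without g, and the two -n examples. The sample file is an assumption.

```shell
# Hypothetical three-line input file.
text=$(mktemp)
printf '%s\n' 'Unix and Unix' 'UNIX rules' 'plain line' > "$text"

first_only=$(sed 's/Unix/UNIX/' "$text" | sed -n '1p')    # only the first Unix on the line changes
every_one=$(sed 's/Unix/UNIX/g' "$text" | sed -n '1p')    # g changes every Unix on the line
top_two=$(sed -n '1,2p' "$text")                          # print lines 1 and 2 only
matching=$(sed -n '/UNIX/p' "$text")                      # print lines containing UNIX

original=$(sed -n '1p' "$text")                           # the input file itself is unchanged
printf '%s\n' "$first_only" "$every_one" "$top_two" "$matching"
rm -f "$text"
```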
sed (continued)
 Example: sed '1,2d' file deletes lines 1 and 2
 Example: sed -n 'l' text prints all lines from text, showing nonprinting characters as \nn and tab characters as ">" (many current sed implementations show a tab as \t instead)
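A minimal sketch of the delete and list commands, assuming the commands meant on this slide are d and l. The exact way l displays a tab varies between sed versions, so the example only prints that result.

```shell
# Delete lines 1 and 2; everything else passes through.
rest=$(printf '%s\n' one two three | sed '1,2d')

# l writes the line in a visually unambiguous form
# (the tab shows up as an escape, and many versions mark the line end with $).
visible=$(printf 'a\tb\n' | sed -n 'l')

printf '%s\n' "$rest" "$visible"
```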
tr
 The tr filter is used to translate characters from
standard input
 Format: tr from-chars to-chars
 Result is written to standard output
 Example: tr e x < file translates every "e" in file to "x" and writes the result to standard output
 The octal representation of a character can be given to tr in the format \nnn
 Example: tr : '\11' translates every colon to a tab
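Both examples can be run on a line of standard input instead of a file. This is a sketch with made-up input.

```shell
# Translate every e to x.
swapped=$(printf 'tree\n' | tr e x)

# Translate every colon to a tab, given as octal 11.
# The quotes keep the shell from removing the backslash.
tabbed=$(printf 'a:b:c\n' | tr : '\11')

printf '%s\n' "$swapped" "$tabbed"
```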
tr (continued)
Character          Octal value
Bell               7
Backspace          10
Tab                11
Newline            12
Linefeed           12
Form feed          14
Carriage return    15
Escape             33
tr (continued)
 Example: tr '[a-z]' '[A-Z]' < file translates all lowercase letters in file to their uppercase equivalents. The character ranges [a-z] and [A-Z] are enclosed in quotes to keep the shell from replacing them with the names of matching files (single-character names from a through z and A through Z)
 To "squeeze" out multiple occurrences of characters, the -s option is used
tr (continued)
 Example: tr -s ' ' ' ' < file squeezes multiple spaces to one space
 The -d option is used to delete single characters from a stream of input
 Format: tr -d from-chars
 Example: tr -d ' ' < file deletes all spaces from the input stream
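A sketch of case conversion and the -s and -d options together, using made-up input lines.

```shell
# Lowercase to uppercase; the ranges are quoted so the shell leaves them alone.
upper=$(printf 'hello world\n' | tr '[a-z]' '[A-Z]')

# Squeeze each run of spaces down to a single space.
squeezed=$(printf 'a   b    c\n' | tr -s ' ' ' ')

# Delete every space.
deleted=$(printf 'a b c\n' | tr -d ' ')

printf '%s\n' "$upper" "$squeezed" "$deleted"
```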
grep

 Searches one or more files for a particular character pattern
 Format: grep pattern files
 Example: grep path .cshrc prints every line in the .cshrc file that contains the pattern "path"
 Example: grep bin .cshrc .login .profile prints every line from any of the three files .cshrc, .login and .profile that contains the pattern "bin"
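A minimal sketch of searching one file versus several. The startup files here are stand-ins created in a scratch directory, and their contents are assumptions.

```shell
# Hypothetical stand-ins for the startup files.
dir=$(mktemp -d)
printf '%s\n' 'set path=(/bin /usr/bin)' 'alias h history' > "$dir/.cshrc"
printf '%s\n' 'echo welcome' > "$dir/.login"

one_file=$(cd "$dir" && grep path .cshrc)          # one file: just the matching line
two_files=$(cd "$dir" && grep bin .cshrc .login)   # several files: file name precedes each line

printf '%s\n' "$one_file" "$two_files"
rm -rf "$dir"
```

When more than one file is named, grep prefixes each matching line with the file it came from.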
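grep (continued)
 Example: grep * smarts will not work as intended, because the shell substitutes * with the names of all files in the current directory before grep runs
 Example: grep '*' smarts passes exactly two arguments to grep, * and smarts, so the asterisk reaches grep unchanged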
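The difference between the two commands can be made visible by echoing the arguments the shell would hand to grep. This is a sketch; the directory holds only one made-up file named smarts.

```shell
# Scratch directory containing a single hypothetical file.
dir=$(mktemp -d)
printf '%s\n' 'rate * hours' 'no star here' > "$dir/smarts"

# Unquoted: the shell expands * to the file names before grep ever runs.
expanded=$(cd "$dir" && echo grep * smarts)

# Quoted: grep receives the asterisk itself and searches for it.
quoted=$(cd "$dir" && grep '*' smarts)

printf '%s\n' "$expanded" "$quoted"
rm -rf "$dir"
```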
sort
 By default, sort takes the lines of the specified input file and sorts them into ascending order
$ cat students
Sue
Vara
Elvis
Luis
Eliza

$ sort students
Eliza
Elvis
Luis
Sue
Vara

$
sort (continued)
 The -u option tells sort to eliminate duplicate lines from the output
sort (continued)
$ echo Ash >> students
$ echo Ash >> students
$ cat students
Sue
Vara
Elvis
Luis
Eliza
Ash
Ash

$ sort students
Ash
Ash
Eliza
Elvis
Luis
Sue
Vara
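For comparison, a sketch of the same data sorted with -u, the sort option that keeps only one copy of each duplicated line. The list is fed through a pipe here rather than read from the students file.

```shell
# Plain sort keeps both copies of Ash; -u keeps one.
plain=$(printf '%s\n' Sue Vara Elvis Luis Eliza Ash Ash | LC_ALL=C sort)
unique=$(printf '%s\n' Sue Vara Elvis Luis Eliza Ash Ash | LC_ALL=C sort -u)

printf '%s\n' "$unique"
```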
sort (continued)
 The -r option reverses the order of the sort
 The -o option is used to direct the output from standard output to a file
 sort students > sorted_students works the same as sort students -o sorted_students
 The -o option also allows a file to be sorted and the output saved to the same file
 Example:
sort students -o students     correct
sort students > students      incorrect (the shell empties students before sort reads it)
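A sketch of reverse sorting and of sorting a file onto itself with -o. The scratch file stands in for students, and the option is written before the file name, which is the most portable order.

```shell
# Hypothetical scratch copy of the list.
f=$(mktemp)
printf '%s\n' Sue Vara Elvis > "$f"

reversed=$(LC_ALL=C sort -r "$f")   # descending order, file untouched

LC_ALL=C sort -o "$f" "$f"          # safe: sort reads all input before writing the output file
in_place=$(cat "$f")

printf '%s\n' "$reversed" "$in_place"
rm -f "$f"
```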
sort (continued)
 The -n option specifies that the first field on the line is a number and that the data is to be sorted arithmetically
sort (continued)

$ cat data
-10 11
15 2
-9 -3
2 13
20 22
3 1

$ sort data
-10 11
-9 -3
15 2
2 13
20 22
3 1

$
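The listing above is plain sort output, which compares the lines character by character. A sketch of what -n changes on the same data, rebuilt here in a scratch file:

```shell
# Scratch copy of the data file.
data=$(mktemp)
printf '%s\n' '-10 11' '15 2' '-9 -3' '2 13' '20 22' '3 1' > "$data"

numeric=$(LC_ALL=C sort -n "$data")   # compare the leading numbers arithmetically

printf '%s\n' "$numeric"
rm -f "$data"
```

With -n, 2 and 3 now come before 15 and 20, as their values require.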
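sort (continued)
 To sort by the second field, +1n should be used instead of -n. +1 says to skip the first field
 +5n would mean to skip the first five fields on each line and then sort the data numerically
 The +pos form is obsolete; current versions of sort use -k instead, so +1n is written -k2n and +2n is written -k3n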
sort (continued)

 Example
$ sort -t: +2n /etc/passwd
root:x:0:1:Super-User:/:/sbin/sh
daemon:x:1:1::/:
bin:x:2:2::/usr/bin:
sys:x:3:3::/:
adm:x:4:4:Admin:/var/adm:
uucp:x:5:5:uucp Admin:/usr/lib/uucp:
nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
listen:x:37:4:Network Admin:/usr/net/nls:
lp:x:71:8:Line Printer Admin:/usr/spool/lp:
oracle:*:101:67:DBA Account:/export/home/oracle:/bin/csh
webuser:*:102:102:Web User:/export/home/webuser:/bin/csh
nobody:x:60001:60001:Nobody:/:
$
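On current systems the +pos notation is usually rejected, so this sketch uses the -k key option as its replacement (-k 2,2n means field 2 only, compared numerically). The sample files are assumptions.

```shell
# Same numbers as the earlier data file.
data=$(mktemp)
printf '%s\n' '-10 11' '15 2' '-9 -3' '2 13' '20 22' '3 1' > "$data"
by_second=$(LC_ALL=C sort -k 2,2n "$data")    # sort on the second field

# Hypothetical three-line excerpt in /etc/passwd format.
pw=$(mktemp)
printf '%s\n' 'lp:x:71:8:Line Printer Admin:/usr/spool/lp:' \
              'root:x:0:1:Super-User:/:/sbin/sh' \
              'uucp:x:5:5:uucp Admin:/usr/lib/uucp:' > "$pw"
# Colon-delimited, sorted on field 3 (the user ID); cut keeps just the names.
by_uid=$(LC_ALL=C sort -t: -k 3,3n "$pw" | cut -d: -f1)

printf '%s\n' "$by_second" "$by_uid"
rm -f "$data" "$pw"
```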
uniq
 Used to find duplicate lines in a file
 Format: uniq in_file out_file
 uniq copies in_file to out_file, removing any duplicate lines in the process
 uniq's definition of duplicate lines is consecutive lines that match exactly
uniq (continued)
 The -d option is used to list duplicate lines

 Example:
$ cat students
Sue
Vara
Elvis
Luis
Eliza
Ash
Ash
$ uniq students
Sue
Vara
Elvis
Luis
Eliza
Ash
$
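Because uniq only compares neighbouring lines, duplicates that are not adjacent survive unless the input is sorted first. A sketch with a made-up list in which Ash appears three times, twice in a row:

```shell
names=$(printf '%s\n' Sue Ash Ash Vara Ash)

adjacent=$(printf '%s\n' "$names" | uniq)                  # only the consecutive pair collapses
dups=$(printf '%s\n' "$names" | uniq -d)                   # list lines that were repeated in a row
sorted_first=$(printf '%s\n' "$names" | LC_ALL=C sort | uniq)   # sort first to catch every duplicate

printf '%s\n' "$adjacent" "$dups" "$sorted_first"
```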