PYTHON APPLICATION
PROGRAMMING -18EC646
MODULE-3
REGULAR EXPRESSIONS
PROF. KRISHNANANDA L
DEPARTMENT OF ECE
GSKSJTI, BENGALURU
WHAT IS MEANT BY
REGULAR EXPRESSION?
We have already seen string and file slicing, searching and parsing,
along with built-in methods such as split() and find().
Searching and extracting text finds applications in
email classification, web searching etc.
Python has a very powerful library for regular expressions (the re module)
that handles many of these tasks quite elegantly.
Regular expressions are like a small but powerful programming
language for matching text patterns. They provide a
standardized way of searching, replacing, and parsing text
with complex patterns of characters.
A regular expression can be defined as a sequence of
characters that is used to search for a pattern in a string.
FEATURES OF REGEX
Hundreds of lines of code can be reduced to a few lines with regular
expressions
Used to construct compilers, interpreters and text editors
Used to search and match text patterns
The power of regular expressions comes when we add special
characters to the search string that allow us to do sophisticated
matching and extraction with very little code.
Used to validate text data formats, especially input data
A Regular Expression (or Regex) is a pattern (or filter) that describes
the set of strings that match it. A regex consists of a
sequence of characters, metacharacters (such as . , \d , ? , \W etc.) and
operators (such as + , * , ? , | , ^ ).
Popular programming languages like Python, Perl, JavaScript, Ruby,
Tcl, C# etc. have Regex capabilities
GENERAL USES OF REGULAR
EXPRESSIONS
Search a string (search and match)
Replace parts of a string (sub)
Break a string into smaller pieces (split)
Find all occurrences of a pattern in a string (findall)
The re module provides the support for using regex in a
Python program. It raises an exception (re.error) if the
pattern passed to it is not a valid regular expression.
Before using regular expressions in a program, we have to
import the library using "import re"
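As a minimal sketch of the exception behaviour described above, an invalid pattern can be caught with re.error. The helper name is_valid_pattern is illustrative only and is not part of the re module.

```python
import re

def is_valid_pattern(pattern):
    """Return True if `pattern` compiles as a regular expression (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # raised by the re module when the pattern is malformed
        return False

ok = is_valid_pattern("hello+")    # well-formed pattern
bad = is_valid_pattern("hello(")   # unbalanced parenthesis
print(ok, bad)
```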
REGEX FUNCTIONS
The re module offers a set of functions
FUNCTION   DESCRIPTION
findall    Returns a list containing all matches of a pattern in the string
search     Returns a Match object if there is a match anywhere in the string
split      Returns a list where the string has been split at each match
sub        Replaces one or more matches in a string (substitutes another string)
match      Checks for a match only at the beginning of the string, with optional flags. Returns a Match object if a match is found, otherwise None.
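A short sketch that calls each of the five functions once may help to compare their return values. The sample string is made up for illustration.

```python
import re

text = "hello world, hello python"   # hypothetical sample string

all_hello = re.findall("hello", text)     # list of every match
found = re.search("world", text)          # Match object, or None
pieces = re.split(", ", text)             # parts of the string between matches
replaced = re.sub("hello", "hi", text)    # new string with replacements
at_start = re.match("world", text)        # None: text does not start with "world"

print(all_hello, found, pieces, replaced, at_start, sep="\n")
```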
EXAMPLE PROGRAM
• We open the file, loop through
each line, and use the regular
expression function search() to print
only the lines that contain the string
"hello". (The same can be done using
line.find().)
# Search for lines that contain 'hello'
import re
fp = open('d:/18ec646/demo1.txt')
for line in fp:
    line = line.rstrip()
    if re.search('hello', line):
        print(line)
Output:
hello and welcome to python class
hello how are you?
# Search for lines that contain 'hello'
import re
fp = open('d:/18ec646/demo2.txt')
for line in fp:
    line = line.rstrip()
    if re.search('hello', line):
        print(line)
Output:
friends,hello and welcome
hello,goodmorning
EXAMPLE PROGRAM
• To get the most out of regex, we need to use special
characters called 'metacharacters'
# Search for lines that start with 'hello'
import re
fp = open('d:/18ec646/demo1.txt')
for line in fp:
    line = line.rstrip()
    if re.search('^hello', line):   ## note the 'caret' metacharacter before hello
        print(line)
Output:
hello and welcome to python class
hello how are you?
# Search for lines that start with 'hello'
import re
fp = open('d:/18ec646/demo2.txt')
for line in fp:
    line = line.rstrip()
    if re.search('^hello', line):   ## note the 'caret' metacharacter before hello
        print(line)
Output:
hello, goodmorning
METACHARACTERS
Metacharacters are characters that are interpreted in a
special way by a RegEx engine.
Metacharacters are very helpful for parsing/extraction
from the given file/string
Metacharacters allow us to build more powerful regular
expressions.
Table-1 provides a summary of metacharacters and their
meaning in RegEx
Here's a list of metacharacters:
[ ] . ^ $ * + ? { } ( ) \ |
Metacharacter   Description   Example
[ ]   Represents a set of characters.   "[a-z]"
\     Introduces a special sequence (can also be used to escape special characters).   "\r"
.     Matches any single character at that position (except the newline character).   "Ja...v."
^     Matches the pattern at the beginning of the string (indicates "startswith").   "^python"
$     Matches the pattern at the end of the string (indicates "endswith").   "world$"
*     Matches zero or more occurrences of the preceding element.   "hello*"
+     Matches one or more occurrences of the preceding element.   "hello+"
{}    Matches the specified number of occurrences of the preceding element.   "hello{2}"
|     Matches either this pattern or the other.   "hello|hi"
()    Capture and group
[ ] - SQUARE BRACKETS
• Square brackets specify a set of characters you wish to match.
• A set is a group of characters given inside a pair of square brackets. The set
as a whole matches any single character that it contains.
[abc] Returns a match if the string contains any of the specified
characters (a, b or c).
[a-n] Returns a match if the string contains any lowercase letter from a to n.
[^arn] Returns a match if the string contains any character other than a, r and n.
[0123] Returns a match if the string contains any of the specified digits.
[0-9] Returns a match if the string contains any digit between 0 and 9.
[0-5][0-9] Returns a match if the string contains any two-digit number from 00 to 59.
[a-zA-Z] Returns a match if the string contains any letter (lower-case or upper-case).
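The sets listed above can be tried directly on short strings. This is a small sketch with made-up sample strings; fullmatch() is used for the two-digit case so that the whole string must fit the pattern.

```python
import re

in_set = re.findall("[abc]", "banana")       # characters that are a, b or c
not_in_set = re.findall("[^arn]", "banana")  # characters other than a, r, n

# [0-5][0-9] accepts two-digit values from 00 to 59
valid_minute = bool(re.fullmatch("[0-5][0-9]", "07"))
invalid_minute = bool(re.fullmatch("[0-5][0-9]", "75"))

print(in_set, not_in_set, valid_minute, invalid_minute)
```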
CONTD..
### illustrating square brackets
import re
fh = open('d:/18ec646/demo5.txt')
for line in fh:
    line = line.rstrip()
    ## search all the lines where w is present and display
    if re.search("[w]", line):
        print(line)
Output:
Hello and welcome
@abhishek,how are you
### illustrating square brackets
import re
fh = open('d:/18ec646/demo3.txt')
for line in fh:
    line = line.rstrip()
    ### search for characters g or e or both and display
    if re.search("[ge]", line):
        print(line)
Output:
Hello and welcome
This is Bangalore
CONTD…
### illustrating square brackets
import re
fh = open('d:/18ec646/demo3.txt')
for line in fh:
    line = line.rstrip()
    if re.search("[th]", line):
        print(line)
Output:
This is Bangalore
This is Paris
This is London
import re
fh = open('d:/18ec646/demo7.txt')
for line in fh:
    line = line.rstrip()
    if re.search("[y]", line):
        print(line)
Output:
johny johny yes papa
open your mouth
### illustrating square brackets
import re
fh = open('d:/18ec646/demo5.txt')
for line in fh:
    line = line.rstrip()
    if re.search("[x-z]", line):
        print(line)
Output:
to:abhishek@yahoo.com
@abhishek,how are you
. PERIOD (DOT)
A period matches any single character (except the newline character '\n')
Expression String Matched?
.. (any two characters)
a No match
ac 1 match
acd 1 match
acde 2 matches (contains 4 characters)
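The match counts in the table above can be reproduced with findall(), which returns non-overlapping matches from left to right. This is a small sketch, not part of the original slides.

```python
import re

# each ".." consumes two characters, so a 4-character string gives 2 matches
dot_matches = {s: re.findall("..", s) for s in ["a", "ac", "acd", "acde"]}

for s, found in dot_matches.items():
    print(s, found)
```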
### illustrating dot metacharacter
import re
fh = open('d:/18ec646/demo5.txt')
for line in fh:
    line = line.rstrip()
    if re.search("y.", line):
        print(line)
Output:
to: abhishek@yahoo.com
@abhishek,how are you
CONTD..
### illustrating dot metacharacter
import re
fh = open('d:/18ec646/demo3.txt')
for line in fh:
    line = line.rstrip()
    if re.search("P.", line):
        print(line)
Output:
This is Paris
### illustrating dot metacharacter
import re
fh = open('d:/18ec646/demo6.txt')
for line in fh:
    line = line.rstrip()
    ## any two characters between T and s
    if re.search("T..s", line):
        print(line)
Output:
This is London
These are beautiful flowers
Thus we see the great London bridge
### illustrating dot metacharacter
import re
fh = open('d:/18ec646/demo6.txt')
for line in fh:
    line = line.rstrip()
    if re.search("L..d", line):
        print(line)
Output:
This is London
Thus we see the great London bridge
^ - CARET
The caret symbol ^ is used to check if a string starts with a certain
character
Expression String Matched?
^a
a 1 match
abc 1 match
bac No match
^ab
abc 1 match
acb No match (starts with a but not followed by b)
### illustrating caret
import re
fh = open('d:/18ec646/demo2.txt')
for line in fh:
    line = line.rstrip()
    if re.search("^h", line):
        print(line)
Output:
hello, goodmorning
### illustrating caret
import re
fh = open('d:/18ec646/demo5.txt')
for line in fh:
    line = line.rstrip()
    if re.search("^f", line):
        print(line)
Output:
from:krishna.sksj@gmail.com
$ - DOLLAR
The dollar symbol $ is used to check if a string ends with a certain
character.
Expression String Matched?
a$
a 1 match
formula 1 match
cab No match
### illustrating metacharacters
import re
fh = open('d:/18ec646/demo5.txt')
for line in fh:
    line = line.rstrip()
    if re.search("m$", line):
        print(line)
Output:
from:krishna.sksj@gmail.com
to: abhishek@yahoo.com
### illustrating metacharacters
import re
fh = open('d:/18ec646/demo7.txt')
for line in fh:
    line = line.rstrip()
    if re.search("papa$", line):
        print(line)
Output:
johny johny yes papa
eating sugar no papa
* - STAR
The star symbol * matches zero or more occurrences of the pattern to its
left.
Expression String Matched?
ma*n
mn 1 match
man 1 match
maaan 1 match
main No match (a is not followed by n)
### illustrating metacharacters
import re
fh = open('d:/18ec646/demo6.txt')
for line in fh:
    line = line.rstrip()
    if re.search("London*", line):
        print(line)
Output:
This is London
Thus we see the great London bridge
+ - PLUS
The plus symbol + matches one or more occurrences of the pattern to its
left.
Expression String Matched?
ma+n
mn No match (no a character)
man 1 match
maaan 1 match
main No match (a is not followed by n)
### illustrating metacharacters
import re
fh = open('d:/18ec646/demo6.txt')
for line in fh:
    line = line.rstrip()
    if re.search("see+", line):
        print(line)
Output:
Thus we see the great London bridge
### illustrating metacharacters
import re
fh = open('d:/18ec646/demo6.txt')
for line in fh:
    line = line.rstrip()
    if re.search("ar+", line):
        print(line)
Output:
These are beautiful flowers
? - QUESTION MARK
The question mark symbol ? matches zero or one occurrence of the pattern to its
left.
Expression String Matched?
ma?n
mn 1 match
man 1 match
maaan No match (more than one a character)
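To compare the three quantifiers * , + and ? side by side, the sketch below runs the table patterns ma*n, ma+n and ma?n against the same four strings. The dictionary layout is just one convenient way to collect the results.

```python
import re

patterns = ["ma*n", "ma+n", "ma?n"]
strings = ["mn", "man", "maaan", "main"]

# results[pattern][string] is True when the pattern is found in the string
results = {p: {s: bool(re.search(p, s)) for s in strings} for p in patterns}

for p in patterns:
    print(p, results[p])
```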
### illustrating metacharacters
import re
fh = open('d:/18ec646/demo5.txt')
for line in fh:
    line = line.rstrip()
    if re.search("@gmail?", line):
        print(line)
Output:
from:krishna.sksj@gmail.com
### illustrating metacharacters
import re
fh = open('d:/18ec646/demo5.txt')
for line in fh:
    line = line.rstrip()
    if re.search("you?", line):
        print(line)
Output:
@abhishek,how are you
{} - BRACES
Finds the specified number of occurrences of a pattern. Consider {n,m}. This
means at least n, and at most m repetitions of the pattern to its left.
If a{2} is given, a should be repeated exactly twice
Expression String Matched?
a{2,3}
abc dat No match
abc daat 1 match (aa in daat)
aabc daaat 2 matches (aa in aabc and aaa in daaat)
aabc daaaat 2 matches (aa in aabc and aaa in daaaat)
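The braces table can be checked with findall(). This sketch also shows that the quantifier is greedy: with four consecutive a characters, a{2,3} takes three of them and the single a left over does not match.

```python
import re

brace_matches = {
    s: re.findall("a{2,3}", s)
    for s in ["abc dat", "abc daat", "aabc daaat", "aabc daaaat"]
}

for s, found in brace_matches.items():
    print(repr(s), found)
```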
| - ALTERNATION
The vertical bar | is used for alternation (the "or" operator).
Expression String Matched?
a|b
cde No match
ade 1 match (a in ade)
acdbea 3 matches (a, b and a in acdbea)
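A quick sketch of the alternation table using findall(), which lists each a or b in the order it is found:

```python
import re

alt_matches = {s: re.findall("a|b", s) for s in ["cde", "ade", "acdbea"]}

for s, found in alt_matches.items():
    print(s, found)
```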
### illustrating metacharacters
import re
fh = open('d:/18ec646/demo7.txt')
for line in fh:
    line = line.rstrip()
    if re.search("yes|no", line):
        print(line)
Output:
johny johny yes papa
eating sugar no papa
### illustrating metacharacters
import re
fh = open('d:/18ec646/demo2.txt')
for line in fh:
    line = line.rstrip()
    if re.search("hello|how", line):
        print(line)
Output:
friends,hello and welcome
hello,goodmorning
() - GROUP
Parentheses () are used to group sub-patterns.
For example, (a|b|c)xz matches any string that contains
either a or b or c followed by xz
Expression String Matched?
(a|b|c)xz
ab xz No match
abxz 1 match (bxz in abxz)
axz cabxz 2 matches (axz, and bxz in cabxz)
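Besides grouping, parentheses also capture the text they match. The sketch below uses finditer() to list, for each match of the table pattern, the whole match (group 0) and the captured letter (group 1).

```python
import re

pattern = "(a|b|c)xz"

# (whole match, captured group) for every match in the string
found = [(m.group(0), m.group(1)) for m in re.finditer(pattern, "axz cabxz")]
no_match = re.search(pattern, "ab xz")   # the space breaks the pattern

print(found)
print(no_match)
```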
### illustrating metacharacters
import re
fh = open('d:/18ec646/demo5.txt')
for line in fh:
    line = line.rstrip()
    if re.search("(hello|how) are", line):
        print(line)
Output:
@abhishek,how are you
### illustrating metacharacters
import re
fh = open('d:/18ec646/demo2.txt')
for line in fh:
    line = line.rstrip()
    if re.search("(hello and)", line):
        print(line)
Output:
friends,hello and welcome
\ - BACKSLASH
Backslash \ is used to escape various characters including all
metacharacters.
For example, \$a matches if a string contains $ followed by a.
Here, $ is not interpreted by the RegEx engine in a special way.
If you are unsure whether a character has a special meaning or not, you
can put \ in front of it. This makes sure the character is not treated
in a special way.
NOTE :- Another way of doing it is putting the special
character in square brackets [ ]
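The two escaping approaches above can be sketched as follows, on a made-up sample string. The standard library function re.escape() is also shown as a third option; it is not mentioned on the slide.

```python
import re

text = "pay $a now"   # hypothetical sample string

unescaped = re.search("$a", text)             # $ means end of string here, so no match
escaped = re.search(r"\$a", text)             # backslash makes $ a literal character
in_brackets = re.search("[$]a", text)         # inside [ ] the $ is also literal
auto = re.search(re.escape("$a"), text)       # re.escape adds the backslash for us

print(unescaped, escaped, in_brackets, auto, sep="\n")
```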
SPECIAL SEQUENCES
A special sequence is a \ followed by one of the characters
(see Table) and has a special meaning
Special sequences make commonly used patterns easier to
write.
SPECIAL SEQUENCES
Character   Description   Example
\A   Returns a match if the specified characters are at the beginning of the string.   r"\AThe"
\b   Returns a match if the specified characters are at the beginning or the end of a word.   r"\bain"   r"ain\b"
\B   Returns a match if the specified characters are present, but not at the beginning (or not at the end) of a word.   r"\Bain"   r"ain\B"
\d   Returns a match if the string contains digits [0-9].   r"\d"
\D   Returns a match if the string contains any non-digit character.   r"\D"
\s   Returns a match if the string contains any whitespace character.   r"\s"
\S   Returns a match if the string contains any non-whitespace character.   r"\S"
\w   Returns a match if the string contains any word character (A to Z, a to z, 0 to 9 and underscore).   r"\w"
\W   Returns a match if the string contains any non-word character.   r"\W"
\A - Matches if the specified characters are at the start of a string.
Expression String Matched?
\Athe
the sun Match
In the sun No match
\b - Matches if the specified characters are at the beginning or end of a word
Expression String Matched?
\bfoo
football Match
a football Match
afootball No match
foo\b
football No match
the afoo test Match
the afootest No match
\B - Opposite of \b. Matches if the specified characters
are not at the beginning or end of a word.
Expression String Matched?
\Bfoo
football No match
a football No match
afootball Match
foo\B
the foo No match
the afoo test No match
the afootest Match
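A small sketch of the word-boundary tables above. Note the r prefix on the patterns: in an ordinary Python string literal "\b" is the backspace character, so the raw-string form r"\b" is needed for the regex meaning.

```python
import re

starts_word = bool(re.search(r"\bfoo", "a football"))      # foo begins a word
inside_word = bool(re.search(r"\bfoo", "afootball"))       # foo is in the middle
ends_word = bool(re.search(r"foo\b", "the afoo test"))     # foo ends a word
not_at_start = bool(re.search(r"\Bfoo", "afootball"))      # \B is the opposite of \b
no_raw_prefix = bool(re.search("\bfoo", "a football"))     # "\b" is a backspace here

print(starts_word, inside_word, ends_word, not_at_start, no_raw_prefix)
```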
\d - Matches any decimal digit. Equivalent to [0-9]
\D - Matches any non-digit character. Equivalent to [^0-9]
Expression String Matched?
\d
12abc3 3 matches (1, 2 and 3)
Python No match
Expression String Matched?
\D
1ab34"50 3 matches (a, b and ")
1345 No match
\s - Matches where a string contains any whitespace
character. Equivalent to [ \t\n\r\f\v].
\S - Matches where a string contains any non-whitespace
character. Equivalent to [^ \t\n\r\f\v].
Expression String Matched?
\s
Python RegEx 1 match
PythonRegEx No match
Expression String Matched?
\S
a b 2 matches (a and b)
(spaces only) No match
\w - Matches any alphanumeric character. Equivalent to [a-zA-Z0-9_].
Underscore is also considered an alphanumeric character
\W - Matches any non-alphanumeric character. Equivalent
to [^a-zA-Z0-9_]
Expression String Matched?
\w
12&":;c 3 matches (1, 2 and c)
%"> ! No match
Expression String Matched?
\W
1a2%c 1 match (%)
Python No match
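The character-class sequences \d, \D, \s, \S, \w and \W can be checked against the strings from the tables above with findall(). This is a sketch; raw strings are used so that the backslashes reach the regex engine unchanged.

```python
import re

digits = re.findall(r"\d", "12abc3")
non_digits = re.findall(r"\D", '1ab34"50')
spaces = re.findall(r"\s", "Python RegEx")
non_spaces = re.findall(r"\S", "a b")
word_chars = re.findall(r"\w", '12&":;c')
non_word_chars = re.findall(r"\W", "1a2%c")

print(digits, non_digits, spaces, non_spaces, word_chars, non_word_chars, sep="\n")
```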
\Z - Matches if the specified characters are at the end of a
string.
Expression String Matched?
Python\Z
I like Python 1 match
I like Python Programming No match
Python is fun. No match
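One practical difference between \Z and $ shows up with lines read from a file, which usually end in a newline. In this sketch (the sample line is made up), \Z matches only at the very end of the string, while $ also matches just before a trailing newline, so rstrip() matters for \Z.

```python
import re

line = "to: abhishek@yahoo.com\n"   # hypothetical line, as read from a file

with_z = re.search(r"com\Z", line)                  # None: the newline comes after "com"
with_dollar = re.search(r"com$", line)              # matches before the trailing newline
after_strip = re.search(r"com\Z", line.rstrip())    # matches once the newline is removed

print(with_z, with_dollar, after_strip, sep="\n")
```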
# check whether the specified
# characters are at the end of string
import re
fp = open('d:/18ec646/demo5.txt')
for x in fp:
    x = x.rstrip()
    if re.findall(r"com\Z", x):
        print(x)
Output:
from:krishna.sksj@gmail.com
to: abhishek@yahoo.com
THE FINDALL() FUNCTION
The findall() function returns a list containing all matches.
The list contains the matches in the order they are found.
If no matches are found, an empty list is returned
Here is the syntax for this function −
re.findall(pattern, string, flags=0)
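A brief sketch of findall() on a made-up string. It also shows one detail not covered above: when the pattern contains a capturing group, findall() returns the captured text rather than the whole match.

```python
import re

text = "from:krishna.sksj@gmail.com to: abhishek@yahoo.com"   # hypothetical sample

whole = re.findall(r"@\w+", text)      # whole matches, in the order found
grouped = re.findall(r"@(\w+)", text)  # only the text captured by the group
nothing = re.findall("xyz", text)      # no match gives an empty list

print(whole, grouped, nothing)
```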
EXAMPLES
import re
text = "How are you. How is everything?"
matches = re.findall("How", text)
print(matches)
Output:
['How', 'How']
# check whether string starts with How
import re
text = "How are you. How is everything?"
x = re.findall("^How", text)
print(text)
print(x)
if x:
    print("string starts with 'How' ")
else:
    print("string does not start with 'How'")
Output:
How are you. How is everything?
['How']
string starts with 'How'
CONTD…
# match all lines that start with 'hello'
import re
fp = open('d:/18ec646/demo1.txt')
for x in fp:
    x = x.rstrip()
    if re.findall('^hello', x):   ## note 'caret'
        print(x)
Output:
hello and welcome to python class
hello how are you?
# match all lines that start with '@'
import re
fp = open('d:/18ec646/demo5.txt')
for x in fp:
    x = x.rstrip()
    if re.findall('^@', x):   ## note 'caret' metacharacter
        print(x)
Output:
@abhishek,how are you
# check whether the string contains
## non-digit characters
import re
fp = open('d:/18ec646/demo5.txt')
for x in fp:
    x = x.rstrip()
    if re.findall(r"\D", x):   ## special sequence
        print(x)
Output:
from:krishna.sksj@gmail.com
to:abhishek@yahoo.com
Hello and welcome
@abhishek,how are you
THE SEARCH() FUNCTION
The search() function searches the string for a match, and
returns a Match object if there is a match.
If there is more than one match, only the first occurrence
of the match will be returned
If no matches are found, the value None is returned
Here is the syntax for this function −
re.search(pattern, string, flags=0)
EXAMPLES on search() function:-
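A minimal sketch of search(), reusing the sample sentence from the findall() example: only the first occurrence is reported, and None comes back when nothing matches, so the result should be tested before it is used.

```python
import re

text = "How are you. How is everything?"

first = re.search("How", text)     # two occurrences exist, only the first is returned
later = re.search("is", text)      # a match further inside the string
missing = re.search("xyz", text)   # no match

if first:
    print(first.group(), first.start())
if later:
    print(later.group(), later.span())
print(missing)
```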
THE SPLIT() FUNCTION
The re.split method splits the string where there is a match
and returns a list of strings where the splits have occurred.
You can pass maxsplit argument to the re.split() method. It's
the maximum number of splits that will occur.
If the pattern is not found, re.split() returns a list containing
the original string.
Here is the syntax for this function −
re.split(pattern, string, maxsplit=0, flags=0)
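The effect of the maxsplit argument can be sketched as follows, using a line from the nursery-rhyme sample seen earlier. With the default maxsplit=0 every match is used as a split point.

```python
import re

line = "johny johny yes papa"

all_parts = re.split(" ", line)                # split at every space
first_only = re.split(" ", line, maxsplit=1)   # at most one split
not_found = re.split("@", line)                # pattern absent: one-element list

print(all_parts, first_only, not_found, sep="\n")
```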
EXAMPLES on split() function:-
# split function
import re
fp = open('d:/18ec646/demo5.txt')
for x in fp:
    x = x.rstrip()
    x = re.split("@", x)
    print(x)
Output:
['from:krishna.sksj', 'gmail.com']
['to: abhishek', 'yahoo.com']
['Hello and welcome']
['', 'abhishek,how are you']
CONTD..
# split function
import re
fp = open('d:/18ec646/demo7.txt')
for x in fp:
    x = x.rstrip()
    x = re.split("e", x)
    print(x)
Output:
['johny johny y', 's papa']
['', 'ating sugar no papa']
['t', 'lling li', 's']
['op', 'n your mouth']
# split function
import re
fp = open('d:/18ec646/demo7.txt')
for x in fp:
    x = x.rstrip()
    x = re.split("papa", x)
    print(x)
Output:
['johny johny yes ', '']
['eating sugar no ', '']
['telling lies']
['open your mouth']
# split function
import re
fp = open('d:/18ec646/demo3.txt')
for x in fp:
    x = x.rstrip()
    x = re.split("is", x)
    print(x)
Output:
['Hello and welcome']
['Th', ' ', ' Bangalore']
['Th', ' ', ' Par', '']
['Th', ' ', ' London']
THE SUB() FUNCTION
The sub() function replaces the matches with the text of your
choice
You can control the number of replacements by specifying
the count parameter
If the pattern is not found, re.sub() returns the original string
Here is the syntax for this function −
re.sub(pattern, repl, string, count=0, flags=0)
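A short sketch of the count parameter. The related function re.subn(), which is not covered on the slide, is included because it also reports how many replacements were made.

```python
import re

text = "How are you. How is everything?"

first_only = re.sub("How", "Where", text, count=1)   # replace only the first match
replaced, n = re.subn("How", "Where", text)          # new string and number of replacements
unchanged = re.sub("xyz", "Where", text)             # pattern absent: original string

print(first_only)
print(replaced, n)
print(unchanged)
```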
EXAMPLES on sub() function:-
### illustration of substitute (replace)
import re
text = "How are you.How is everything?"
x = re.sub("How", "where", text)
print(x)
Output:
where are you.where is everything?
# sub function
import re
fp = open('d:/18ec646/demo3.txt')
for x in fp:
    x = x.rstrip()
    x = re.sub("This", "Where", x)
    print(x)
Output:
Hello and welcome
Where is Bangalore
Where is Paris
Where is London
THE MATCH() FUNCTION
If zero or more characters at the beginning of the string match
the regular expression, a corresponding Match object is returned.
None is returned if the string does not match the pattern.
Here is the syntax for this function −
re.match(pattern, string, flags=0)
A compiled pattern offers the same operation as
Pattern.match(string[, pos[, endpos]]). The optional pos and endpos
parameters have the same meaning as for the Pattern.search() method.
search() Vs match()
Python offers two different primitive operations based on
regular expressions:
re.match() checks for a match only at the beginning of the string,
while re.search() checks for a match anywhere in the string
Eg:-
# match function
import re
fp = open('d:/18ec646/demo3.txt')
for x in fp:
    x = x.rstrip()
    if re.match("This", x):
        print(x)
Output:
This is Bangalore
This is Paris
This is London
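To contrast the two functions on a single string, the following sketch looks for "is" in one of the sample lines: match() fails because the line does not begin with "is", while search() finds it inside "This".

```python
import re

line = "This is London"

m1 = re.match("is", line)     # None: "is" is not at the beginning
m2 = re.search("is", line)    # found inside "This"
m3 = re.match("This", line)   # the line does begin with "This"

print(m1)
print(m2.start())
print(m3.group())
```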
MATCH OBJECT
A Match Object is an object containing information about the
search and the result
If there is no match, the value None will be returned, instead
of the Match Object
Some of the commonly used methods and attributes of match
objects are:
match.group(), match.start(), match.end(), match.span(),
match.string
match.group()
The group() method returns the part of the string where
there is a match
match.start(), match.end()
The start() method returns the index of the start of the
matched substring.
Similarly, end() returns the end index of the matched
substring.
match.string
The string attribute returns the string that was passed to the function.
match.span()
The span() method returns a tuple containing the start
and end index of the matched part.
Eg:-
OUTPUT:
(12, 17)
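The Match object methods and the string attribute described above can be tried together. This sketch uses its own sample sentence, so its indices are unrelated to the output shown on the slide.

```python
import re

m = re.search("London", "This is London bridge")   # hypothetical sample sentence

if m:   # search() returns None when there is no match
    print(m.group())           # the matched text
    print(m.start(), m.end())  # start index and end index of the match
    print(m.span())            # the same two indices as a tuple
    print(m.string)            # the string that was searched
```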
Lk module4 structures
Krishna Nanda
 
Ad

Recently uploaded (20)

Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptxExplainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
MahaveerVPandit
 
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Development of MLR, ANN and ANFIS Models for Estimation of PCUs at Different ...
Journal of Soft Computing in Civil Engineering
 
Smart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineeringSmart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineering
rushikeshnavghare94
 
Level 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical SafetyLevel 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical Safety
JoseAlbertoCariasDel
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)
rccbatchplant
 
Introduction to FLUID MECHANICS & KINEMATICS
Introduction to FLUID MECHANICS &  KINEMATICSIntroduction to FLUID MECHANICS &  KINEMATICS
Introduction to FLUID MECHANICS & KINEMATICS
narayanaswamygdas
 
RICS Membership-(The Royal Institution of Chartered Surveyors).pdf
RICS Membership-(The Royal Institution of Chartered Surveyors).pdfRICS Membership-(The Royal Institution of Chartered Surveyors).pdf
RICS Membership-(The Royal Institution of Chartered Surveyors).pdf
MohamedAbdelkader115
 
Degree_of_Automation.pdf for Instrumentation and industrial specialist
Degree_of_Automation.pdf for  Instrumentation  and industrial specialistDegree_of_Automation.pdf for  Instrumentation  and industrial specialist
Degree_of_Automation.pdf for Instrumentation and industrial specialist
shreyabhosale19
 
theory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptxtheory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptx
sanchezvanessa7896
 
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
charlesdick1345
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Journal of Soft Computing in Civil Engineering
 
some basics electrical and electronics knowledge
some basics electrical and electronics knowledgesome basics electrical and electronics knowledge
some basics electrical and electronics knowledge
nguyentrungdo88
 
AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)
Vəhid Gəruslu
 
Compiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptxCompiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptx
RushaliDeshmukh2
 
Raish Khanji GTU 8th sem Internship Report.pdf
Raish Khanji GTU 8th sem Internship Report.pdfRaish Khanji GTU 8th sem Internship Report.pdf
Raish Khanji GTU 8th sem Internship Report.pdf
RaishKhanji
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
Data Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptxData Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptx
RushaliDeshmukh2
 
Smart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptxSmart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptx
rushikeshnavghare94
 
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptxExplainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
MahaveerVPandit
 
Smart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineeringSmart Storage Solutions.pptx for production engineering
Smart Storage Solutions.pptx for production engineering
rushikeshnavghare94
 
Level 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical SafetyLevel 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical Safety
JoseAlbertoCariasDel
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)
rccbatchplant
 
Introduction to FLUID MECHANICS & KINEMATICS
Introduction to FLUID MECHANICS &  KINEMATICSIntroduction to FLUID MECHANICS &  KINEMATICS
Introduction to FLUID MECHANICS & KINEMATICS
narayanaswamygdas
 
RICS Membership-(The Royal Institution of Chartered Surveyors).pdf
RICS Membership-(The Royal Institution of Chartered Surveyors).pdfRICS Membership-(The Royal Institution of Chartered Surveyors).pdf
RICS Membership-(The Royal Institution of Chartered Surveyors).pdf
MohamedAbdelkader115
 
Degree_of_Automation.pdf for Instrumentation and industrial specialist
Degree_of_Automation.pdf for  Instrumentation  and industrial specialistDegree_of_Automation.pdf for  Instrumentation  and industrial specialist
Degree_of_Automation.pdf for Instrumentation and industrial specialist
shreyabhosale19
 
theory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptxtheory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptx
sanchezvanessa7896
 
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
DATA-DRIVEN SHOULDER INVERSE KINEMATICS YoungBeom Kim1 , Byung-Ha Park1 , Kwa...
charlesdick1345
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
some basics electrical and electronics knowledge
some basics electrical and electronics knowledgesome basics electrical and electronics knowledge
some basics electrical and electronics knowledge
nguyentrungdo88
 
AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)
Vəhid Gəruslu
 
Compiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptxCompiler Design Unit1 PPT Phases of Compiler.pptx
Compiler Design Unit1 PPT Phases of Compiler.pptx
RushaliDeshmukh2
 
Raish Khanji GTU 8th sem Internship Report.pdf
Raish Khanji GTU 8th sem Internship Report.pdfRaish Khanji GTU 8th sem Internship Report.pdf
Raish Khanji GTU 8th sem Internship Report.pdf
RaishKhanji
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
Data Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptxData Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptx
RushaliDeshmukh2
 
Smart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptxSmart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptx
rushikeshnavghare94
 

Python regular expressions

  • 1. PYTHON APPLICATION PROGRAMMING -18EC646 MODULE-3 REGULAR EXPRESSIONS PROF. KRISHNANANDA L DEPARTMENT OF ECE GSKSJTI, BENGALURU
  • 2. WHAT IS MEANT BY REGULAR EXPRESSION?
    We have seen string/file slicing, searching, and parsing, along with built-in methods such as split and find.
    Searching and extracting text has applications in email classification, web searching, etc.
    Python has a very powerful regular expression library that handles many of these tasks quite elegantly.
    Regular expressions are like a small but powerful programming language for matching text patterns; they provide a standardized way of searching, replacing, and parsing text with complex patterns of characters.
    A regular expression can be defined as a sequence of characters used to search for a pattern in a string.
  • 3. FEATURES OF REGEX
    Hundreds of lines of code can often be reduced to a few lines with regular expressions.
    Used to construct compilers, interpreters, and text editors.
    Used to search and match text patterns.
    The power of regular expressions comes when we add special characters to the search string, which allow sophisticated matching and extraction with very little code.
    Used to validate text data formats, especially input data.
    A Regular Expression (or Regex) is a pattern (or filter) that describes the set of strings matching it. A regex consists of a sequence of characters, metacharacters (such as . , \d , ? , \W etc.) and operators (such as + , * , ? , | , ^ ).
    Popular programming languages like Python, Perl, JavaScript, Ruby, Tcl, C# etc. have regex capabilities.
  • 4. GENERAL USES OF REGULAR EXPRESSIONS
    Search a string (search and match)
    Replace parts of a string (sub)
    Break a string into small pieces (split)
    Find all occurrences of a pattern in a string (findall)
    The module re provides support for regex in a Python program. The re module raises an exception (re.error) if the regular expression itself is invalid.
    Before using regular expressions in a program, we have to import the library using "import re".
  • 5. REGEX FUNCTIONS
    The re module offers a set of functions:
    FUNCTION   DESCRIPTION
    findall    Returns a list containing all matches of a pattern in the string
    search     Returns a Match object if there is a match anywhere in the string
    split      Returns a list where the string has been split at each match
    sub        Replaces one or more matches in a string (substitutes another string)
    match      Matches the regex pattern only at the beginning of the string, with optional flags. Returns a Match object if a match is found, otherwise None.
  • 6. EXAMPLE PROGRAM
    We open the file, loop through each line, and use the regular expression search() to print only the lines that contain the string "hello". (The same can be done using line.find().)
        # Search for lines that contain 'hello'
        import re
        fp = open('d:/18ec646/demo1.txt')
        for line in fp:
            line = line.rstrip()
            if re.search('hello', line):
                print(line)
    Output:
        hello and welcome to python class
        hello how are you?
        # Search for lines that contain 'hello'
        import re
        fp = open('d:/18ec646/demo2.txt')
        for line in fp:
            line = line.rstrip()
            if re.search('hello', line):
                print(line)
    Output:
        friends,hello and welcome
        hello,goodmorning
  • 7. EXAMPLE PROGRAM
    To get the most out of regex, we use special characters called 'metacharacters'.
        # Search for lines that start with 'hello'
        import re
        fp = open('d:/18ec646/demo1.txt')
        for line in fp:
            line = line.rstrip()
            if re.search('^hello', line):  # note the caret metacharacter before hello
                print(line)
    Output:
        hello and welcome to python class
        hello how are you?
        # Search for lines that start with 'hello'
        import re
        fp = open('d:/18ec646/demo2.txt')
        for line in fp:
            line = line.rstrip()
            if re.search('^hello', line):  # note the caret metacharacter before hello
                print(line)
    Output:
        hello, goodmorning
  • 8. METACHARACTERS
    Metacharacters are characters that are interpreted in a special way by a RegEx engine.
    Metacharacters are very helpful for parsing and extracting data from a given file or string.
    Metacharacters allow us to build more powerful regular expressions.
    Table-1 provides a summary of metacharacters and their meaning in RegEx.
    Here's a list of metacharacters: [ ] . ^ $ * + ? { } ( ) \ |
  • 9. Metacharacter   Description   Example
    [ ]   Represents a set of characters.   "[a-z]"
    \     Introduces a special sequence (can also be used to escape special characters).   "\r"
    .     Matches any single character at that position (except the newline character).   "Ja...v."
    ^     Matches the pattern at the beginning of the string (indicates "startswith").   "^python"
    $     Matches the pattern at the end of the string (indicates "endswith").   "world$"
    *     Zero or more occurrences of the preceding element.   "hello*"
    +     One or more occurrences of the preceding element.   "hello+"
    {}    The specified number of occurrences of the preceding element.   "hello{2}"
    |     Either the pattern on the left or the one on the right.   "hello|hi"
    ()    Capture and group.
  • 10. [ ] - SQUARE BRACKETS
    Square brackets specify a set of characters you wish to match.
    A set is a group of characters given inside a pair of square brackets. Some characters, such as - and a leading ^, have a special meaning inside a set.
    [abc]        Returns a match if the string contains any of the specified characters in the set.
    [a-n]        Returns a match if the string contains any of the characters from a to n.
    [^arn]       Returns a match if the string contains any character other than a, r, and n.
    [0123]       Returns a match if the string contains any of the specified digits.
    [0-9]        Returns a match if the string contains any digit from 0 to 9.
    [0-5][0-9]   Returns a match if the string contains any two-digit number from 00 to 59.
    [a-zA-Z]     Returns a match if the string contains any letter (lower-case or upper-case).
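The character sets listed on this slide can be tried on a few in-memory strings. This is only a minimal sketch: the sample strings and variable names are made up for illustration and do not come from the demo files used in the slides.

```python
import re

# Hypothetical sample strings (not from the course demo files)
samples = ["welcome", "Bangalore", "12:45", "xyz"]

# [abc] matches any one of the characters a, b or c
has_abc = [s for s in samples if re.search(r"[abc]", s)]

# [^a-z] matches any character that is NOT a lower-case letter
has_non_lower = [s for s in samples if re.search(r"[^a-z]", s)]

# [0-5][0-9] matches a two-digit number from 00 to 59
has_minutes = [s for s in samples if re.search(r"[0-5][0-9]", s)]

print(has_abc)
print(has_non_lower)
print(has_minutes)
```

Note that a set always matches exactly one character; repetition has to be added with a quantifier or by writing several sets in a row, as in [0-5][0-9].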
  • 11. CONTD..
        ### illustrating square brackets
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("[w]", line):
                print(line)  ## search all the lines where w is present and display
    Output:
        Hello and welcome
        @abhishek,how are you
        ### illustrating square brackets
        import re
        fh = open('d:/18ec646/demo3.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("[ge]", line):
                print(line)  ### search for characters g or e or both and display
    Output:
        Hello and welcome
        This is Bangalore
  • 12. CONTD…
        ### illustrating square brackets
        import re
        fh = open('d:/18ec646/demo3.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("[th]", line):  # lines containing t or h
                print(line)
    Output:
        This is Bangalore
        This is Paris
        This is London
        import re
        fh = open('d:/18ec646/demo7.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("[y]", line):
                print(line)
    Output:
        johny johny yes papa
        open your mouth
        ### illustrating square brackets
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("[x-z]", line):  # lines containing x, y or z
                print(line)
    Output:
        to:[email protected]
        @abhishek,how are you
  • 13. . - PERIOD (DOT)
    A period matches any single character (except the newline '\n').
    Expression                String   Matched?
    .. (any two characters)   a        No match
                              ac       1 match
                              acd      1 match
                              acde     2 matches (contains 4 characters)
        ### illustrating dot metacharacter
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("y.", line):  # y followed by any one character
                print(line)
    Output:
        to: [email protected]
        @abhishek,how are you
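The match counts in the table on this slide can be checked with re.findall(), which returns non-overlapping matches from left to right. A small sketch, using the strings from the table; the variable names are illustrative only.

```python
import re

# ".." consumes two characters per match, so a 3-character string yields one match
two_chars = {text: re.findall(r"..", text) for text in ["a", "ac", "acd", "acde"]}
for text, found in two_chars.items():
    print(text, found)

# The dot does not match a newline unless the re.DOTALL flag is passed
no_newline = re.findall(r".", "a\nb")
with_dotall = re.findall(r".", "a\nb", flags=re.DOTALL)
print(no_newline, with_dotall)
```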
  • 14. CONTD..
        ### illustrating dot metacharacter
        import re
        fh = open('d:/18ec646/demo3.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("P.", line):
                print(line)
    Output:
        This is Paris
        ### illustrating dot metacharacter
        import re
        fh = open('d:/18ec646/demo6.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("T..s", line):  ## any two characters between T and s
                print(line)
    Output:
        This is London
        These are beautiful flowers
        Thus we see the great London bridge
        ### illustrating dot metacharacter
        import re
        fh = open('d:/18ec646/demo6.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("L..d", line):
                print(line)
    Output:
        This is London
        Thus we see the great London bridge
  • 15. ^ - CARET
    The caret symbol ^ is used to check whether a string starts with a certain character or pattern.
    Expression   String   Matched?
    ^a           a        1 match
                 abc      1 match
                 bac      No match
    ^ab          abc      1 match
                 acb      No match (starts with a but not followed by b)
        ### illustrating caret
        import re
        fh = open('d:/18ec646/demo2.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("^h", line):
                print(line)
    Output:
        hello, goodmorning
        ### illustrating caret
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("^f", line):
                print(line)
    Output:
        from:[email protected]
  • 16. $ - DOLLAR
    The dollar symbol $ is used to check whether a string ends with a certain character or pattern.
    Expression   String    Matched?
    a$           a         1 match
                 formula   1 match
                 cab       No match
        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("m$", line):
                print(line)
    Output:
        from:[email protected]
        to: [email protected]
        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo7.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("papa$", line):
                print(line)
    Output:
        johny johny yes papa
        eating sugar no papa
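The two anchors can be compared side by side on a list of strings instead of a file. This is a sketch under the assumption that the lines look like those in the slide outputs; the last entry is a made-up counter-example.

```python
import re

# Lines modelled on the slide outputs; "papa is here" is a hypothetical extra
lines = [
    "hello, goodmorning",
    "friends,hello and welcome",
    "eating sugar no papa",
    "papa is here",
]

# ^ anchors the pattern to the start of the string
starts_with_hello = [l for l in lines if re.search(r"^hello", l)]

# $ anchors the pattern to the end of the string
ends_with_papa = [l for l in lines if re.search(r"papa$", l)]

print(starts_with_hello)
print(ends_with_papa)
```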
  • 17. * - STAR
    The star symbol * matches zero or more occurrences of the element immediately to its left.
    Expression   String   Matched?
    ma*n         mn       1 match
                 man      1 match
                 maaan    1 match
                 main     No match (a is not followed by n)
        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo6.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("London*", line):  # "Londo" followed by zero or more n
                print(line)
    Output:
        This is London
        Thus we see the great London bridge
  • 18. + - PLUS
    The plus symbol + matches one or more occurrences of the element immediately to its left.
    Expression   String   Matched?
    ma+n         mn       No match (no a character)
                 man      1 match
                 maaan    1 match
                 main     No match (a is not followed by n)
        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo6.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("see+", line):  # "se" followed by one or more e
                print(line)
    Output:
        Thus we see the great London bridge
        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo6.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("ar+", line):  # "a" followed by one or more r
                print(line)
    Output:
        These are beautiful flowers
  • 19. ? - QUESTION MARK
    The question mark symbol ? matches zero or one occurrence of the element immediately to its left.
    Expression   String   Matched?
    ma?n         mn       1 match
                 man      1 match
                 maaan    No match (more than one a character)
        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("@gmail?", line):  # "@gmai" followed by an optional l
                print(line)
    Output:
        from:[email protected]
        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("you?", line):  # "yo" followed by an optional u
                print(line)
    Output:
        @abhishek,how are you
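The three quantifiers *, + and ? can be compared on the same words used in the tables of the last three slides. A minimal sketch; the dictionary and variable names are illustrative.

```python
import re

words = ["mn", "man", "maaan", "main"]

# For each quantifier, collect the words in which the pattern is found
results = {}
for pattern in (r"ma*n", r"ma+n", r"ma?n"):
    results[pattern] = [w for w in words if re.search(pattern, w)]
    print(pattern, results[pattern])

# A quantifier applies only to the single element just before it:
# "London*" means "Londo" followed by zero or more "n", so "Londo" also matches.
star_on_last_char = re.search(r"London*", "Londo bridge") is not None
print(star_on_last_char)
```

To repeat a whole word rather than its last character, the word would have to be wrapped in a group, for example (London)*.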
  • 20. {} - BRACES
    Finds the specified number of occurrences of a pattern.
    Consider {n,m}. This means at least n and at most m repetitions of the element immediately to its left.
    If a{2} is given, a must be repeated exactly twice.
    Expression   String        Matched?
    a{2,3}       abc dat       No match
                 abc daat      1 match (aa in daat)
                 aabc daaat    2 matches (aa in aabc and aaa in daaat)
                 aabc daaaat   2 matches (aa in aabc and aaa in daaaat)
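The rows of the braces table can be reproduced with re.findall(). This sketch reuses the strings from the table; the variable names are assumptions made for illustration.

```python
import re

strings = ["abc dat", "abc daat", "aabc daaat", "aabc daaaat"]

# a{2,3}: at least two and at most three consecutive 'a' characters (greedy)
found = {s: re.findall(r"a{2,3}", s) for s in strings}
for s, matches in found.items():
    print(repr(s), matches)

# a{2}: exactly two 'a' characters per match
exactly_two = re.findall(r"a{2}", "daaaat")
print(exactly_two)
```

In the last table row, the run of four a characters gives one match of three; the single a left over is too short to match again.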
  • 21. | - ALTERNATION
    The vertical bar | is used for alternation (the "or" operator).
    Expression   String   Matched?
    a|b          cde      No match
                 ade      1 match (at a)
                 acdbea   3 matches (at a, b and a)
        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo7.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("yes|no", line):
                print(line)
    Output:
        johny johny yes papa
        eating sugar no papa
        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo2.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("hello|how", line):
                print(line)
    Output:
        friends,hello and welcome
        hello,goodmorning
  • 22. () - GROUP
    Parentheses () are used to group sub-patterns. For example, (a|b|c)xz matches any string that contains a, b or c followed by xz.
    Expression   String      Matched?
    (a|b|c)xz    ab xz       No match
                 abxz        1 match (bxz in abxz)
                 axz cabxz   2 matches (axz and bxz in cabxz)
        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo5.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("(hello|how) are", line):
                print(line)
    Output:
        @abhishek,how are you
        ### illustrating metacharacters
        import re
        fh = open('d:/18ec646/demo2.txt')
        for line in fh:
            line = line.rstrip()
            if re.search("(hello and)", line):
                print(line)
    Output:
        friends,hello and welcome
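Alternation and grouping interact with re.findall() in a way that is easy to trip over: when the pattern contains one capturing group, findall() returns the text of the group, not the whole match. The sketch below uses the strings from the two tables above; the non-capturing form (?:...) is standard re syntax that the slides do not cover.

```python
import re

# Plain alternation: every a or b is a separate match
alternation = re.findall(r"a|b", "acdbea")

# With a capturing group, findall returns only what the group captured
captured = re.findall(r"(a|b|c)xz", "axz cabxz")

# A non-capturing group (?:...) groups without capturing, so whole matches come back
whole = re.findall(r"(?:a|b|c)xz", "axz cabxz")

# The space between "ab" and "xz" prevents any match
no_match = re.findall(r"(?:a|b|c)xz", "ab xz")

print(alternation, captured, whole, no_match)
```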
  • 23. \ - BACKSLASH
    The backslash is used to escape various characters, including all metacharacters.
    For example, \$a matches if a string contains $ followed by a. Here, $ is not interpreted by the RegEx engine in a special way.
    If you are unsure whether a character has a special meaning or not, you can put \ in front of it. This makes sure the character is not treated in a special way.
    NOTE: Another way of doing this is to put the special character inside square brackets [ ].
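Both escaping styles described on this slide can be tried directly. This is a sketch with a made-up sample string; re.escape() is a standard helper from the re module that the slides do not mention, shown here as one more option for matching text literally.

```python
import re

text = "price is $a per unit"  # hypothetical sample string

# \$ matches a literal dollar sign
escaped = re.search(r"\$a", text)

# Inside square brackets, $ is also taken literally
in_brackets = re.search(r"[$]a", text)

# Unescaped, $ means "end of string", so "$a" can never match here
unescaped = re.search(r"$a", text)

# re.escape() backslash-escapes the metacharacters in a string for us
literal = re.search(re.escape("1+1"), "1+1=2")

print(escaped.group(), in_brackets.group(), unescaped, literal.group())
```

Writing patterns as raw strings (r"...") keeps Python itself from interpreting the backslashes before the regex engine sees them.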
  • 24. SPECIAL SEQUENCES
    A special sequence is a \ followed by one of the characters in the table below, and has a special meaning.
    Special sequences make commonly used patterns easier to write.
  • 25. SPECIAL SEQUENCES
    Character   Description   Example
    \A   Returns a match if the specified characters are present at the beginning of the string.   "\AThe"
    \b   Returns a match if the specified characters are present at the beginning or at the end of a word.   r"\bain"  r"ain\b"
    \B   Returns a match if the specified characters are present, but NOT at the beginning (or at the end) of a word.   r"\Bain"  r"ain\B"
    \d   Returns a match if the string contains digits [0-9].   "\d"
    \D   Returns a match if the string contains any non-digit character.   "\D"
    \s   Returns a match if the string contains any whitespace character.   "\s"
    \S   Returns a match if the string contains any non-whitespace character.   "\S"
    \w   Returns a match if the string contains any word characters (A to Z, a to z, 0 to 9 and underscore).   "\w"
    \W   Returns a match if the string contains any non-word character.   "\W"
  • 26. \A - Matches if the specified characters are at the start of a string.
    Expression   String       Matched?
    \Athe        the sun      Match
                 In the sun   No match
    \b - Matches if the specified characters are at the beginning or end of a word.
    Expression   String          Matched?
    \bfoo        football        Match
                 a football      Match
                 afootball       No match
    foo\b        football        No match
                 the afoo test   Match
                 the afootest    No match
  • 27. \B - Opposite of \b. Matches if the specified characters are not at the beginning or end of a word.
    Expression   String          Matched?
    \Bfoo        football        No match
                 a football      No match
                 afootball       Match
    foo\B        the foo         No match
                 the afoo test   No match
                 the afootest    Match
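The word-boundary tables for \b and \B can be checked with a short loop. A minimal sketch using strings from those tables; note the raw strings, since in an ordinary Python string "\b" would mean a backspace character.

```python
import re

# (pattern, text) pairs taken from the \b and \B tables
cases = [
    (r"\bfoo", "football"),       # foo at the start of a word
    (r"\bfoo", "afootball"),      # foo inside a word
    (r"foo\b", "the afoo test"),  # foo at the end of a word
    (r"foo\b", "the afootest"),   # foo followed by more letters
    (r"\Bfoo", "afootball"),      # foo NOT at the start of a word
    (r"\Bfoo", "a football"),     # foo at the start of a word
]

results = [re.search(pattern, text) is not None for pattern, text in cases]
for (pattern, text), matched in zip(cases, results):
    print(pattern, repr(text), "Match" if matched else "No match")
```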
  • 28. \d - Matches any decimal digit. Equivalent to [0-9]
    \D - Matches any non-digit character. Equivalent to [^0-9]
    Expression   String     Matched?
    \d           12abc3     3 matches (at 1, 2 and 3)
                 Python     No match
    Expression   String     Matched?
    \D           1ab34"50   3 matches (at a, b and ")
                 1345       No match
  • 29. \s - Matches where a string contains any whitespace character. Equivalent to [ \t\n\r\f\v].
    \S - Matches where a string contains any non-whitespace character. Equivalent to [^ \t\n\r\f\v].
    Expression   String         Matched?
    \s           Python RegEx   1 match
                 PythonRegEx    No match
    Expression   String                  Matched?
    \S           a b                     2 matches (at a and b)
                 (only whitespace)       No match
  • 30. \w - Matches any alphanumeric character. Equivalent to [a-zA-Z0-9_]. The underscore is also considered an alphanumeric character.
    \W - Matches any non-alphanumeric character. Equivalent to [^a-zA-Z0-9_]
    Expression   String    Matched?
    \w           12&":;c   3 matches (at 1, 2 and c)
                 %"> !     No match
    Expression   String    Matched?
    \W           1a2%c     1 match (at %)
                 Python    No match
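The character-class sequences from the last three slides can be verified with re.findall(), which lists every matching character. A short sketch using the strings from the tables; the variable names are illustrative.

```python
import re

digits = re.findall(r"\d", "12abc3")          # decimal digits
non_digits = re.findall(r"\D", '1ab34"50')    # everything that is not a digit
spaces = re.findall(r"\s", "Python RegEx")    # whitespace characters
non_spaces = re.findall(r"\S", "a b")         # non-whitespace characters
word_chars = re.findall(r"\w", '12&":;c')     # letters, digits, underscore
non_word = re.findall(r"\W", "1a2%c")         # anything else

print(digits, non_digits, spaces, non_spaces, word_chars, non_word)
```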
  • 31. \Z - Matches if the specified characters are at the end of a string.
    Expression   String                       Matched?
    Python\Z     I like Python                1 match
                 I like Python Programming    No match
                 Python is fun.               No match
        # check whether the specified
        # characters are at the end of the string
        import re
        fp = open('d:/18ec646/demo5.txt')
        for x in fp:
            x = x.rstrip()
            if re.findall(r"com\Z", x):
                print(x)
    Output:
        from:[email protected]
        to: [email protected]
  • 32. REGEX FUNCTIONS
    The re module offers a set of functions:
    FUNCTION   DESCRIPTION
    findall    Returns a list containing all matches of a pattern in the string
    search     Returns a Match object if there is a match anywhere in the string
    split      Returns a list where the string has been split at each match
    sub        Replaces one or more matches in a string (substitutes another string)
    match      Matches the regex pattern only at the beginning of the string, with optional flags. Returns a Match object if a match is found, otherwise None.
  • 33. THE FINDALL() FUNCTION
    The findall() function returns a list containing all matches.
    The list contains the matches in the order they are found.
    If no matches are found, an empty list is returned.
    Here is the syntax for this function:
        re.findall(pattern, string, flags=0)
        import re
        text = "How are you. How is everything?"  # avoid naming a variable str, which shadows the built-in
        matches = re.findall("How", text)
        print(matches)
    Output:
        ['How', 'How']
  • 35. CONTD..
        # check whether the string starts with How
        import re
        text = "How are you. How is everything?"
        x = re.findall("^How", text)
        print(text)
        print(x)
        if x:
            print("string starts with 'How'")
        else:
            print("string does not start with 'How'")
    Output:
        How are you. How is everything?
        ['How']
        string starts with 'How'
  • 36. CONTD…
        # match all lines that start with 'hello'
        import re
        fp = open('d:/18ec646/demo1.txt')
        for x in fp:
            x = x.rstrip()
            if re.findall('^hello', x):  ## note 'caret'
                print(x)
    Output:
        hello and welcome to python class
        hello how are you?
        # match all lines that start with '@'
        import re
        fp = open('d:/18ec646/demo5.txt')
        for x in fp:
            x = x.rstrip()
            if re.findall('^@', x):  ## note 'caret' metacharacter
                print(x)
    Output:
        @abhishek,how are you
        # check whether the string contains
        # non-digit characters
        import re
        fp = open('d:/18ec646/demo5.txt')
        for x in fp:
            x = x.rstrip()
            if re.findall(r"\D", x):  ## special sequence
                print(x)
    Output:
        from:[email protected]
        to:[email protected]
        Hello and welcome
        @abhishek,how are you
  • 37. THE SEARCH() FUNCTION
    The search() function searches the string for a match, and returns a Match object if there is a match.
    If there is more than one match, only the first occurrence of the match is returned.
    If no matches are found, the value None is returned.
    Here is the syntax for this function:
        re.search(pattern, string, flags=0)
  • 38. EXAMPLES on search() function [code and output images not captured in this transcript]
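Since the slide's own examples are not available in this transcript, the following is a hypothetical illustration of the behaviour of search() described on the previous slide, not a reconstruction of the original code. The sample string is borrowed from the findall() slides.

```python
import re

text = "How are you. How is everything?"

# search() stops at the first occurrence, even though "How" appears twice
first = re.search(r"How", text)
print(first.group(), first.span())

# When nothing matches, search() returns None rather than raising an error
missing = re.search(r"where", text)
print(missing)

# A Match object is truthy and None is falsy, so the result can be tested directly
if first:
    print("found 'How' at index", first.start())
```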
  • 39. THE SPLIT() FUNCTION
    The re.split() method splits the string wherever there is a match and returns a list of the resulting strings.
    You can pass the maxsplit argument to re.split(). It is the maximum number of splits that will occur.
    If the pattern is not found, re.split() returns a list containing the original string.
    Here is the syntax for this function:
        re.split(pattern, string, maxsplit=0, flags=0)
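The effect of maxsplit and the not-found case can be sketched on a single in-memory line. The line is taken from the demo7 outputs shown elsewhere in the deck; the variable names are illustrative.

```python
import re

line = "johny johny yes papa"

# Split at every whitespace character
all_parts = re.split(r"\s", line)

# maxsplit=1 stops after the first split; the rest of the string stays intact
first_split = re.split(r"\s", line, maxsplit=1)

# Pattern not found: a one-element list holding the original string
unchanged = re.split(r"@", line)

print(all_parts)
print(first_split)
print(unchanged)
```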
  • 40. EXAMPLES on split() function
        # split function
        import re
        fp = open('d:/18ec646/demo5.txt')
        for x in fp:
            x = x.rstrip()
            x = re.split("@", x)
            print(x)
    Output:
        ['from:krishna.sksj', 'gmail.com']
        ['to: abhishek', 'yahoo.com']
        ['Hello and welcome']
        ['', 'abhishek,how are you']
  • 41. CONTD..
        # split function
        import re
        fp = open('d:/18ec646/demo7.txt')
        for x in fp:
            x = x.rstrip()
            x = re.split("e", x)
            print(x)
    Output:
        ['johny johny y', 's papa']
        ['', 'ating sugar no papa']
        ['t', 'lling li', 's']
        ['op', 'n your mouth']
        # split function
        import re
        fp = open('d:/18ec646/demo7.txt')
        for x in fp:
            x = x.rstrip()
            x = re.split("papa", x)
            print(x)
    Output:
        ['johny johny yes ', '']
        ['eating sugar no ', '']
        ['telling lies']
        ['open your mouth']
        # split function
        import re
        fp = open('d:/18ec646/demo3.txt')
        for x in fp:
            x = x.rstrip()
            x = re.split("is", x)
            print(x)
    Output:
        ['Hello and welcome']
        ['Th', ' ', ' Bangalore']
        ['Th', ' ', ' Par', '']
        ['Th', ' ', ' London']
  • 42. THE SUB() FUNCTION
    The sub() function replaces the matches with the text of your choice.
    You can control the number of replacements by specifying the count parameter.
    If the pattern is not found, re.sub() returns the original string.
    Here is the syntax for this function:
        re.sub(pattern, repl, string, count=0, flags=0)
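The count parameter and the not-found case can be sketched as follows. The sample string comes from the findall() slides; re.subn() is a related standard function, not covered in the slides, that also reports how many replacements were made.

```python
import re

text = "How are you. How is everything?"

# Default count=0 replaces every match
all_replaced = re.sub(r"How", "Where", text)

# count=1 replaces only the first match
first_only = re.sub(r"How", "Where", text, count=1)

# subn() returns a (new_string, number_of_replacements) tuple
result, n = re.subn(r"How", "Where", text)

# Pattern not found: the original string comes back unchanged
not_found = re.sub(r"Why", "Where", text)

print(all_replaced)
print(first_only)
print(n, not_found == text)
```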
  • 43. EXAMPLES on sub() function
        ### illustration of substitute (replace)
        import re
        text = "How are you.How is everything?"
        x = re.sub("How", "where", text)
        print(x)
    Output:
        where are you.where is everything?
        # sub function
        import re
        fp = open('d:/18ec646/demo3.txt')
        for x in fp:
            x = x.rstrip()
            x = re.sub("This", "Where", x)
            print(x)
    Output:
        Hello and welcome
        Where is Bangalore
        Where is Paris
        Where is London
  • 44. THE MATCH() FUNCTION
    If zero or more characters at the beginning of the string match the regular expression, match() returns a corresponding Match object.
    It returns None if the beginning of the string does not match the pattern.
    Here is the syntax for the module-level function:
        re.match(pattern, string, flags=0)
    and for the method of a compiled pattern:
        Pattern.match(string[, pos[, endpos]])
    The optional pos and endpos parameters have the same meaning as for the search() method.
  • 45. search() Vs match()
    Python offers two different primitive operations based on regular expressions:
    re.match() checks for a match only at the beginning of the string, while re.search() checks for a match anywhere in the string.
    Eg:-
        # match function
        import re
        fp = open('d:/18ec646/demo3.txt')
        for x in fp:
            x = x.rstrip()
            if re.match("This", x):
                print(x)
    Output:
        This is Bangalore
        This is Paris
        This is London
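The difference between the two functions shows up as soon as the pattern is not at the start of the string. A minimal sketch using one line from the demo2 outputs shown earlier; the variable names are illustrative.

```python
import re

line = "friends,hello and welcome"

# match() only looks at the beginning of the string
at_start = re.match(r"hello", line)

# search() scans the whole string
anywhere = re.search(r"hello", line)

# search() with a leading ^ behaves like match() on a single-line string
anchored = re.search(r"^hello", line)

print(at_start)
print(anywhere.span())
print(anchored)
```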
  • 46. MATCH OBJECT
    A Match object is an object containing information about the search and the result.
    If there is no match, the value None is returned instead of a Match object.
    Some of the commonly used methods and attributes of Match objects are:
    match.group(), match.start(), match.end(), match.span(), match.string
  • 47. match.group()
    The group() method returns the part of the string where there is a match.
    match.start(), match.end()
    The start() method returns the index of the start of the matched substring. Similarly, end() returns the end index of the matched substring.
    match.string
    The string attribute returns the string that was passed to the function.
  • 48. match.span()
    The span() method returns a tuple containing the start and end index of the matched part.
    Eg:- [example code not captured in this transcript]
    Output:
        (12, 17)
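The Match object methods and attributes from the last three slides can be seen together in one short sketch. The sample string below is an assumption chosen for illustration; it is not the string used on the original slide, so the indices differ from the slide's output.

```python
import re

text = "Welcome to Python class"  # hypothetical sample string
m = re.search(r"Python", text)

print(m.group())   # the matched text
print(m.start())   # index where the match begins
print(m.end())     # index just past the end of the match
print(m.span())    # (start, end) as a tuple
print(m.string)    # the string that was searched

# The span can be used to slice the original string
start, end = m.span()
sliced = text[start:end]
print(sliced)
```

Because search() returns None when nothing matches, calling these methods without first checking the result would raise an AttributeError on a failed search.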