Regular Expressions
A Regular Expression (RegEx) is a sequence of characters that defines a search pattern. For example,
^a...s$
The above code defines a RegEx pattern. The pattern is: any five-character string starting with a and
ending with s (each . stands for any single character).
A pattern defined using RegEx can be used to match against a string.
Expression    String       Matched?
^a...s$       abs          No match
              alias        Match
              abyss        Match
              Alias        No match
              An abacus    No match
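The table above can be reproduced with a few lines of Python. This is a minimal sketch that assumes Python's built-in re module (introduced later in this document); the names candidates and results are chosen only for illustration.

```python
import re

pattern = r'^a...s$'
candidates = ['abs', 'alias', 'abyss', 'Alias', 'An abacus']

# re.search returns a match object on success and None otherwise
results = {text: bool(re.search(pattern, text)) for text in candidates}

for text, matched in results.items():
    print(f"{text!r}: {'Match' if matched else 'No match'}")
```

Note that Alias fails only because matching is case-sensitive by default.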
Specify Pattern Using RegEx
To specify regular expressions, metacharacters are used. In the above example, ^ and $ are
metacharacters.
MetaCharacters
Metacharacters are characters that are interpreted in a special way by a RegEx engine. Here's a list of
metacharacters:
[] . ^ $ * + ? {} () \ |
[] - Square brackets
Square brackets specify a set of characters you wish to match.
Expression    String       Matched?
[abc]         a            1 match
              ac           2 matches
              Hey Jude     No match
              abc de ca    5 matches
Here, [abc] matches if the string contains any of a, b or c.
You can also specify a range of characters using - inside square brackets.
• [a-e] is the same as [abcde].
• [1-4] is the same as [1234].
• [0-39] is the same as [01239].
You can complement (invert) the character set by placing a caret ^ at the start of the square brackets.
• [^abc] means any character except a or b or c.
• [^0-9] means any non-digit character.
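The set, range and inverted-set rules above can be checked as follows. This is an illustrative sketch using re.findall (described later in this document); the variable names are assumptions made for the example.

```python
import re

# Count how many characters from the set [abc] appear in each sample string
samples = ['a', 'ac', 'Hey Jude', 'abc de ca']
set_counts = {s: len(re.findall(r'[abc]', s)) for s in samples}

# [0-39] means "0 through 3, or 9", not "0 through 39"
range_hits = re.findall(r'[0-39]', '0123456789')

# A leading caret inverts the set: everything except digits
non_digits = re.findall(r'[^0-9]', 'a1b2')

print(set_counts)
print(range_hits)
print(non_digits)
```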
. - Period
A period matches any single character (except newline '\n').
Expression    String    Matched?
..            a         No match
              ac        1 match
              acd       1 match
              acde      2 matches (contains 4 characters)
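A short sketch of the period's behaviour, including the newline exception. The re.DOTALL flag used here is part of Python's re module but is not covered in this document, so treat its use as an aside.

```python
import re

# Matches do not overlap, so four characters yield two two-character matches
pairs = re.findall(r'..', 'acde')
odd_pairs = re.findall(r'..', 'acd')

# By default the period does not match a newline ...
no_newline = re.search(r'.', '\n')
# ... unless the re.DOTALL flag is passed
with_dotall = re.search(r'.', '\n', re.DOTALL)

print(pairs, odd_pairs, no_newline, with_dotall is not None)
```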
^ - Caret
The caret symbol ^ is used to check if a string starts with a certain character.
Expression    String    Matched?
^a            a         1 match
              abc       1 match
              bac       No match
^ab           abc       1 match
              acb       No match (starts with a but not followed by b)
$ - Dollar
The dollar symbol $ is used to check if a string ends with a certain character.
Expression    String     Matched?
a$            a          1 match
              formula    1 match
              cab        No match
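The two anchors can be tried side by side. A minimal sketch, with list names picked for illustration:

```python
import re

# Keep only the strings that start with a
starts_with_a = [s for s in ['a', 'abc', 'bac'] if re.search(r'^a', s)]

# Keep only the strings that end with a
ends_with_a = [s for s in ['a', 'formula', 'cab'] if re.search(r'a$', s)]

print(starts_with_a)
print(ends_with_a)
```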
* - Star
The star symbol * matches zero or more occurrences of the pattern to its left.
Expression    String    Matched?
ma*n          mn        1 match
              man       1 match
              maaan     1 match
              main      No match (a is not followed by n)
              woman     1 match
+ - Plus
The plus symbol + matches one or more occurrences of the pattern to its left.
Expression    String    Matched?
ma+n          mn        No match (no a character)
              man       1 match
              maaan     1 match
              main      No match (a is not followed by n)
              woman     1 match
? - Question Mark
The question mark symbol ? matches zero or one occurrence of the pattern to its left.
Expression    String    Matched?
ma?n          mn        1 match
              man       1 match
              maaan     No match (more than one a character)
              main      No match (a is not followed by n)
              woman     1 match
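Running the same five strings through all three quantifiers makes the difference between *, + and ? easy to see. This is a sketch; the dictionary name quantifier_results is an assumption for the example.

```python
import re

samples = ['mn', 'man', 'maaan', 'main', 'woman']

# For each pattern, collect the sample strings in which it finds a match
quantifier_results = {
    pattern: [s for s in samples if re.search(pattern, s)]
    for pattern in (r'ma*n', r'ma+n', r'ma?n')
}

for pattern, matched in quantifier_results.items():
    print(pattern, matched)
```

Only mn and maaan are treated differently: mn needs a quantifier that allows zero a characters, and maaan needs one that allows more than one.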
{} - Braces
Consider this code: {n,m}. This means at least n, and at most m repetitions of the pattern to its left.
Expression    String         Matched?
a{2,3}        abc dat        No match
              abc daat       1 match (aa in daat)
              aabc daaat     2 matches (aa in aabc, aaa in daaat)
              aabc daaaat    2 matches (aa in aabc, aaa in daaaat)
Let's try one more example. The RegEx [0-9]{2,4} matches at least 2 digits but not more than 4 digits.
Expression    String           Matched?
[0-9]{2,4}    ab123csde        1 match (123)
              12 and 345673    3 matches (12, 3456, 73)
              1 and 2          No match
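The rows above can be verified by listing what each pattern actually captures. A minimal sketch using re.findall; the variable names are illustrative.

```python
import re

# The quantifier is greedy: it takes three a's where it can,
# and the single a left over in daaaat is too short to match
a_runs = re.findall(r'a{2,3}', 'aabc daaaat')

# Six digits in a row are consumed as a run of four, then a run of two
digit_runs = re.findall(r'[0-9]{2,4}', '12 and 345673')

# Single digits never reach the minimum of two
too_short = re.findall(r'[0-9]{2,4}', '1 and 2')

print(a_runs, digit_runs, too_short)
```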
| - Alternation
Vertical bar | is used for alternation (or operator).
Expression    String    Matched?
a|b           cde       No match
              ade       1 match (a)
              acdbea    3 matches (a, b, a)
Here, a|b matches any string that contains either a or b.
() - Group
Parentheses () are used to group sub-patterns. For example, (a|b|c)xz matches any string that contains
a, b or c followed by xz.
Expression    String       Matched?
(a|b|c)xz     ab xz        No match
              abxz         1 match (bxz)
              axz cabxz    2 matches (axz and bxz)
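A sketch of alternation and grouping in Python. One detail worth knowing is that re.findall returns the text captured by the group when the pattern contains one, so re.finditer is used here to get the whole match instead. The variable names are assumptions for the example.

```python
import re

# Alternation: every a or b in the string is a separate match
alt_hits = re.findall(r'a|b', 'acdbea')

# With a group in the pattern, findall returns only the grouped part
captured = re.findall(r'(a|b|c)xz', 'axz cabxz')

# group(0) of each match object is the full matched text
whole_matches = [m.group(0) for m in re.finditer(r'(a|b|c)xz', 'axz cabxz')]

print(alt_hits, captured, whole_matches)
```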
\ - Backslash
Backslash \ is used to escape various characters including all metacharacters. For example,
\$a matches if a string contains $ followed by a. Here, $ is not interpreted by the RegEx engine in a special
way.
If you are unsure whether a punctuation character has special meaning, you can put \ in front of it. This makes sure
the character is not treated in a special way. (A backslash in front of a letter or digit is different: it may form a special sequence such as \d.)
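The effect of escaping can be seen by comparing a pattern with and without the backslash. This sketch also uses re.escape, a function from Python's re module that is not covered in this document and that adds the backslashes for you; the sample strings are made up for illustration.

```python
import re

text = 'cost: $a'

# Escaped: \$ is a literal dollar sign
dollar_hit = re.search(r'\$a', text)

# Unescaped: $ is the end-of-string anchor, so nothing can follow it
anchor_only = re.search(r'$a', text)

# re.escape builds a pattern that matches its argument literally
literal_hit = re.fullmatch(re.escape('1+1=2'), '1+1=2')
# Without escaping, + is read as "one or more of the preceding 1"
unescaped = re.fullmatch('1+1=2', '1+1=2')

print(dollar_hit.group(), anchor_only, literal_hit is not None, unescaped)
```

Writing patterns as raw strings (r'...') keeps Python itself from interpreting the backslash before the RegEx engine sees it.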
Special Sequences
Special sequences make commonly used patterns easier to write. Here's a list of special sequences:
\A - Matches if the specified characters are at the start of a string.
Expression    String        Matched?
\Athe         the sun       Match
              In the sun    No match
\b - Matches if the specified characters are at the beginning or end of a word.
Expression    String           Matched?
\bfoo         football         Match
              a football       Match
              afootball        No match
foo\b         the foo          Match
              the afoo test    Match
              the afootest     No match
\B - Opposite of \b. Matches if the specified characters are not at the beginning or end of a word.
Expression    String           Matched?
\Bfoo         football         No match
              a football       No match
              afootball        Match
foo\B         the foo          No match
              the afoo test    No match
              the afootest     Match
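A sketch of \b and \B in Python. It also shows a common pitfall: without the r prefix, Python reads '\b' as a backspace character before the pattern ever reaches the RegEx engine. The list names are illustrative.

```python
import re

samples = ['football', 'a football', 'afootball']

# \bfoo: foo must begin a word
starts_word = [s for s in samples if re.search(r'\bfoo', s)]

# \Bfoo: foo must not begin a word
inside_word = [s for s in samples if re.search(r'\Bfoo', s)]

# foo\b: foo must end a word
ends_word = [s for s in ['the foo', 'the afoo test', 'the afootest']
             if re.search(r'foo\b', s)]

# Not a raw string: '\b' is a backspace here, so the pattern never matches
not_raw = re.search('\bfoo', 'football')

print(starts_word, inside_word, ends_word, not_raw)
```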
\d - Matches any decimal digit. Equivalent to [0-9]
Expression    String    Matched?
\d            12abc3    3 matches (1, 2, 3)
              Python    No match
\D - Matches any character that is not a decimal digit. Equivalent to [^0-9]
Expression    String      Matched?
\D            1ab34"50    3 matches (a, b, ")
              1345        No match
\s - Matches where a string contains any whitespace character. Equivalent to [ \t\n\r\f\v].
Expression    String          Matched?
\s            Python RegEx    1 match
              PythonRegEx     No match
\S - Matches where a string contains any non-whitespace character. Equivalent to [^ \t\n\r\f\v].
Expression    String               Matched?
\S            a b                  2 matches (a and b)
              (whitespace only)    No match
\w - Matches any alphanumeric character (digits and letters). Equivalent to [a-zA-Z0-9_]. Note that the
underscore _ is also counted as an alphanumeric character.
Expression    String      Matched?
\w            12&": ;c    3 matches (1, 2, c)
              %"> !       No match
\W - Matches any non-alphanumeric character. Equivalent to [^a-zA-Z0-9_]
Expression    String    Matched?
\W            1a2%c     1 match (%)
              Python    No match
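The six character-class sequences can be checked against the same strings used in the tables. A minimal sketch with illustrative variable names:

```python
import re

digits = re.findall(r'\d', '12abc3')            # decimal digits
non_digits = re.findall(r'\D', '1ab34"50')      # everything except digits
spaces = re.findall(r'\s', 'Python RegEx')      # whitespace
non_spaces = re.findall(r'\S', 'a b')           # everything except whitespace
word_chars = re.findall(r'\w', '12&": ;c')      # letters, digits, underscore
non_word = re.findall(r'\W', '1a2%c')           # everything else

print(digits, non_digits, spaces, non_spaces, word_chars, non_word)
```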
\Z - Matches if the specified characters are at the end of a string.
Expression    String                       Matched?
Python\Z      I like Python                1 match
              I like Python Programming    No match
              Python is fun.               No match
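\A and \Z look similar to ^ and $, but they are stricter in two situations. The sketch below illustrates both; the re.MULTILINE flag is part of Python's re module but is not covered in this document, and the sample strings are made up.

```python
import re

# With re.MULTILINE, ^ also matches right after a newline; \A never does
text = 'In\nthe sun'
caret_multiline = re.search(r'^the', text, re.MULTILINE)
start_only = re.search(r'\Athe', text, re.MULTILINE)

# $ also matches just before a trailing newline; \Z only at the very end
line = 'I like Python\n'
dollar_hit = re.search(r'Python$', line)
end_only = re.search(r'Python\Z', line)

print(caret_multiline is not None, start_only, dollar_hit is not None, end_only)
```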
Now that we understand the basics of RegEx, let's discuss how to use RegEx in Python code.
Python RegEx
Python has a module named re to work with regular expressions. To use it, we need to import the
module.
import re
The module defines several functions and constants to work with RegEx.
re.search()
The re.search() function takes two arguments: a pattern and a string. It looks for the first
location where the RegEx pattern produces a match with the string.
If the search is successful, re.search() returns a match object; if not, it returns None.
Syntax of the function:
s = re.search(pattern, string)
Write a Python program that performs pattern matching using the search() function.
import re

string = "Python is fun"
s = re.search('Python', string)
if s:
    print("pattern found inside the string")
else:
    print("pattern not found")
Here, s contains a match object.
s.start(), s.end() and s.span()
The start() method returns the index of the start of the matched substring. Similarly, end() returns
the index just past the end of the matched substring. The span() method returns a tuple containing the start and end
indices of the matched part, and group() returns the matched text itself.
>>> s.start()
0
>>> s.end()
6
>>> s.span()
(0, 6)
>>> s.group()
'Python'
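The same methods work when the match is not at the start of the string. A minimal sketch that searches for fun instead of Python; the check for None guards against calling methods on a failed search.

```python
import re

string = "Python is fun"
s = re.search('fun', string)

if s is not None:
    start, end = s.span()
    # Slicing with the span gives back the same text as group()
    print(s.group(), start, end, string[start:end])
```

Because end() is one past the last matched character, string[start:end] is exactly the matched text.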
re.match()
The re.match() function takes two arguments: a pattern and a string. If the pattern is found at the
start of the string, the function returns a match object. If not, it returns None.
Write a Python program that performs pattern matching using the match() function.
import re

pattern = '^a...s$'
test_string = 'abyss'
result = re.match(pattern, test_string)
if result:
    print("Search successful.")
else:
    print("Search unsuccessful.")
Here, we used the re.match() function to look for pattern at the start of test_string.
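The difference between re.match() and re.search() shows up when the pattern occurs later in the string. The sketch below also uses re.fullmatch(), another function in Python's re module (not covered in this document) that requires the whole string to match; the sample strings are illustrative.

```python
import re

text = 'I like Python'

at_start = re.match('Python', text)     # only looks at the start of the string
anywhere = re.search('Python', text)    # scans the whole string

# fullmatch succeeds only if the pattern covers the entire string
whole = re.fullmatch('a...s', 'abyss')
partial = re.fullmatch('a...s', 'abyssal')

print(at_start, anywhere.span(), whole is not None, partial)
```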
re.sub()
The syntax of re.sub() is:
re.sub(pattern, replace, string)
The function returns a string in which the matched occurrences are replaced with the content of the replace
variable.
If the pattern is not found, re.sub() returns the original string.
You can pass count as a fourth argument to re.sub(). If omitted, it defaults to 0, which
replaces all occurrences. Passing it by keyword (count=2) is the clearer form, and Python 3.13 and later deprecate passing it positionally.
Example1:
re.sub('^a','b','aaa')
Output:
'baa'
Example2:
s=re.sub('a','b','aaa')
print(s)
Output:
bbb
Example3:
s=re.sub('a','b','aaa',count=2)
print(s)
Output:
bba
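The three behaviours described above (replace all, replace a limited number, and no match) can be sketched together. The last line is an extra illustration with a made-up string: collapsing runs of whitespace into single spaces.

```python
import re

text = 'aaa'

all_replaced = re.sub('a', 'b', text)           # count omitted: replace everything
first_two = re.sub('a', 'b', text, count=2)     # stop after two replacements
unchanged = re.sub('z', 'b', text)              # no match: original string returned

# A slightly more practical use: normalise whitespace
collapsed = re.sub(r'\s+', ' ', 'abc  12\tde   23')

print(all_replaced, first_two, unchanged, collapsed)
```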
re.subn()
re.subn() is similar to re.sub() except that it returns a tuple of 2 items containing the new string and
the number of substitutions made.
Example1:
s=re.subn('a','b','aaa')
print(s)
Output:
('bbb', 3)
re.findall()
The re.findall() method returns a list of strings containing all matches.
If the pattern is not found, re.findall() returns an empty list.
Syntax:
re.findall(pattern, string)
Example1:
s=re.findall('a','abab')
print(s)
Output:
['a', 'a']
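re.findall() becomes more useful with a pattern than with a single literal character. This sketch pulls every run of digits out of a made-up sentence and shows the empty-list result when nothing matches.

```python
import re

string = 'hello 12 hi 89. Howdy 34'

# \d+ matches one or more digits, so each number comes out as one string
numbers = re.findall(r'\d+', string)

# No digits at all: the result is an empty list, not None
none_found = re.findall(r'\d+', 'no digits here')

print(numbers, none_found)
```

The results are strings; convert them with int() if numeric values are needed.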
re.split()
The re.split() function splits the string at each match and returns a list of the pieces.
If the pattern is not found, re.split() returns a list containing the original string.
You can pass a maxsplit argument to re.split(). It is the maximum number of splits
that will occur.
The default value of maxsplit is 0, meaning all possible splits.
Syntax:
re.split(pattern, string)
Example1:
s=re.split('a','abab')
print(s)
Output:
['', 'b', 'b']
Example2:
s=re.split('a','aababa',maxsplit=3)
print(s)
Output:
['', '', 'b', 'ba']
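Splitting on a pattern rather than a fixed character is where re.split() differs from str.split(). A minimal sketch on a made-up string, splitting at every run of digits:

```python
import re

string = 'Twelve:12 Eighty nine:89.'

# Every run of digits acts as a separator
pieces = re.split(r'\d+', string)

# maxsplit=1 stops after the first separator; the rest is left intact
first_only = re.split(r'\d+', string, maxsplit=1)

# No separator found: a one-element list with the original string
no_split = re.split(r'\d+', 'no digits')

print(pieces, first_only, no_split)
```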
CASE STUDY
Street Addresses: In this case study, we will take one street address as input and try to perform some
operations on the input by making use of library functions.
Example:
str1='100 NORTH MAIN ROAD'
str1.replace('ROAD','RD')
Output:
'100 NORTH MAIN RD'
str1.replace('NORTH','NRTH')
Output:
'100 NRTH MAIN ROAD'
re.sub('ROAD','RD',str1)
Output:
'100 NORTH MAIN RD'
re.sub('NORTH','NRTH',str1)
Output:
'100 NRTH MAIN ROAD'
re.split('A',str1)
Output:
['100 NORTH M', 'IN RO', 'D']
re.findall('O',str1)
Output:
['O', 'O']
re.sub('^1','2',str1)
Output:
'200 NORTH MAIN ROAD'
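In the examples above, str.replace() and re.sub() give the same results. The difference appears when the text to replace also occurs inside another word. The sketch below uses a hypothetical address, 100 BROAD ROAD, which is not part of the original case study, and combines \b and $ from the earlier sections.

```python
import re

address = '100 BROAD ROAD'   # hypothetical address chosen for illustration

# Plain replacement also rewrites the ROAD inside BROAD
naive = address.replace('ROAD', 'RD')

# \bROAD$ only matches ROAD as a whole word at the end of the string
anchored = re.sub(r'\bROAD$', 'RD', address)

print(naive)
print(anchored)
```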
Roman Numerals
I = 1
V = 5
X = 10
L = 50
C = 100
D = 500
M = 1000
4 is written as IV, 9 as IX, 40 as XL, 90 as XC, 400 as CD and 900 as CM.
Let us write the Roman numeral representation of a few numbers.
Ex1:
1940
MCMXL
Ex2:
1946
MCMXLVI
Ex3:
1888
MDCCCLXXXVIII
Checking for thousands:
1000=M
2000=MM
3000=MMM
A possible pattern allows up to three M characters, each made optional with ?.
Example:
pattern = '^M?M?M?$'
re.search(pattern, 'M')
Output:
<re.Match object; span=(0, 1), match='M'>
re.search(pattern, 'MM')
Output:
<re.Match object; span=(0, 2), match='MM'>
re.search(pattern, 'MMM')
Output:
<re.Match object; span=(0, 3), match='MMM'>
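re.search(pattern, 'ML')
re.search(pattern, 'MX')
re.search(pattern, 'MI')
re.search(pattern, 'MMMM')
Output:
Each of these four calls returns None. ML, MX and MI contain characters the pattern does not allow, and MMMM has more than three M characters.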
Checking for Hundreds:
100=C
200=CC
300=CCC
400=CD
500=D
600=DC
700=DCC
800=DCCC
900=CM
Example:
pattern = '^M?M?M?(CM|CD|D?C?C?C?)$'
re.search(pattern,'MCM')
Output:
<re.Match object; span=(0, 3), match='MCM'>
re.search(pattern,'MD')
Output:
<re.Match object; span=(0, 2), match='MD'>
re.search(pattern,'MMMCCC')
Output:
<re.Match object; span=(0, 6), match='MMMCCC'>
re.search(pattern,'MCMLXX')
Output:
None (the pattern does not cover the tens yet, so LXX cannot be matched)
Using the {n,m} syntax
With {n,m}, the preceding pattern must occur at least n times and at most m times.
Example:
pattern='^M{0,3}$'
re.search(pattern,'MM')
Output:
<re.Match object; span=(0, 2), match='MM'>
re.search(pattern,'M')
Output:
<re.Match object; span=(0, 1), match='M'>
re.search(pattern,'MMM')
Output:
<re.Match object; span=(0, 3), match='MMM'>
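The {0,3} form is a shorter way of writing three optional M characters. The following sketch checks that the two patterns agree on a handful of sample strings; the names optional, counted and samples are chosen for illustration.

```python
import re

optional = r'^M?M?M?$'
counted = r'^M{0,3}$'
samples = ['', 'M', 'MM', 'MMM', 'MMMM', 'MX']

# Record, for each sample, whether each pattern matches
outcomes = {s: (bool(re.search(optional, s)), bool(re.search(counted, s)))
            for s in samples}
agree = all(a == b for a, b in outcomes.values())

print(outcomes)
print(agree)
```

Note that both patterns also match the empty string, since every M is optional.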
Checking for Tens and Ones:
1=I
2=II
3=III
4=IV
5=V
6=VI
7=VII
8=VIII
9=IX
10=X
20=XX
30=XXX
40=XL
50=L
60=LX
70=LXX
80=LXXX
90=XC
Example:
pattern='^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$'
re.search(pattern,'MDLVI')
Output:
<re.Match object; span=(0, 5), match='MDLVI'>
re.search(pattern,'MCMXLVI')
Output:
<re.Match object; span=(0, 7), match='MCMXLVI'>
re.search(pattern,'MMMCCCXLV')
Output:
<re.Match object; span=(0, 9), match='MMMCCCXLV'>
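Putting the pieces together, the full pattern can be written more compactly with {n,m} and wrapped in a small helper. This is one possible sketch, not the only way to do it: the function name is_roman is an assumption, and re.VERBOSE (a flag from Python's re module not covered above) is used only so the pattern can carry comments.

```python
import re

# re.VERBOSE lets the pattern span several lines and include comments
ROMAN = re.compile(r"""
    ^M{0,3}                # thousands: 0 to 3 M
    (CM|CD|D?C{0,3})       # hundreds: 900, 400, or 0-300 with optional 500
    (XC|XL|L?X{0,3})       # tens: 90, 40, or 0-30 with optional 50
    (IX|IV|V?I{0,3})       # ones: 9, 4, or 0-3 with optional 5
    $
""", re.VERBOSE)


def is_roman(text):
    """Return True if text is a well-formed Roman numeral (1 to 3999)."""
    # Every part of the pattern is optional, so the empty string
    # would match; reject it explicitly
    return bool(text) and ROMAN.search(text) is not None


for candidate in ['MCMXLVI', 'MDLVI', 'MMMCCCXLV', 'MMMM', 'IIII']:
    print(candidate, is_roman(candidate))
```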
Python (regular expression)

  • 1. Regular Expressions A Regular Expression (RegEx) is a sequence of characters that defines a search pattern. For example, ^a...s$ The above code defines a RegEx pattern. The pattern is: any five letter string starting with a and ending with s. A pattern defined using RegEx can be used to match against a string. Expression String Matched? ^a...s$ abs No match alias Match abyss Match Alias No match An abacus No match Specify Pattern Using RegEx To specify regular expressions, metacharacters are used. In the above example, ^ and $ are metacharacters. MetaCharacters Metacharacters are characters that are interpreted in a special way by a RegEx engine. Here's a list of metacharacters:
  • 2. [] . ^ $ * + ? {} () | [] - Square brackets Square brackets specify a set of characters you wish to match. Expression String Matched? [abc] a 1 match ac 2 matches Hey Jude No match abc de ca 5 matches Here, [abc] will match if the string you are trying to match contains any of the a, b or c. You can also specify a range of characters using - inside square brackets. • [a-e] is the same as [abcde]. • [1-4] is the same as [1234]. • [0-39] is the same as [01239]. You can complement (invert) the character set by using caret ^ symbol at the start of a square- bracket. • [^abc] means any character except a or b or c. • [^0-9] means any non-digit character. . - Period A period matches any single character (except newline 'n').
  • 3. Expression String Matched? .. a No match ac 1 match acd 1 match acde 2 matches (contains 4 characters) ^ - Caret The caret symbol ^ is used to check if a string starts with a certain character. Expression String Matched? ^a a 1 match abc 1 match bac No match ^ab abc 1 match acb No match (starts with a but not followed by b)
  • 4. $ - Dollar The dollar symbol $ is used to check if a string ends with a certain character. Expression String Matched? a$ a 1 match formula 1 match cab No match * - Star The star symbol * matches zero or more occurrences of the pattern left to it. Expression String Matched? ma*n mn 1 match man 1 match maaan 1 match main No match (a is not followed by n)
  • 5. Expression String Matched? woman 1 match + - Plus The plus symbol + matches one or more occurrences of the pattern left to it. Expression String Matched? ma+n mn No match (no a character) man 1 match maaan 1 match main No match (a is not followed by n) woman 1 match ? - Question Mark The question mark symbol ? matches zero or one occurrence of the pattern left to it.
  • 6. Expression String Matched? ma?n mn 1 match man 1 match maaan No match (more than one a character) main No match (a is not followed by n) woman 1 match {} - Braces Consider this code: {n,m}. This means at least n, and at most m repetitions of the pattern left to it. Expression String Matched? a{2,3} abc dat No match abc daat 1 match (at daat) aabc daaat 2 matches (at aabc and daaat) aabc daaaat 2 matches (at aabc and daaaat)
  • 7. Let's try one more example. The RegEx [0-9]{2,4} matches at least 2 digits but not more than 4 digits.

    Expression   String          Matched?
    [0-9]{2,4}   ab123csde       1 match (123)
                 12 and 345673   3 matches (12, 3456, 73)
                 1 and 2         No match

    | - Alternation
    The vertical bar | is used for alternation (the "or" operator).

    Expression   String   Matched?
    a|b          cde      No match
                 ade      1 match (a)
                 acdbea   3 matches (a, b, a)

    Here, a|b matches any string that contains either a or b.

    () - Group
  • 8. Parentheses () are used to group sub-patterns. For example, (a|b|c)xz matches any string that contains a, b or c followed by xz.

    Expression   String      Matched?
    (a|b|c)xz    ab xz       No match
                 abxz        1 match (bxz)
                 axz cabxz   2 matches (axz, bxz)

    \ - Backslash
    The backslash is used to escape various characters, including all metacharacters. For example, \$a matches if a string contains $ followed by a. Here, $ is not interpreted by the RegEx engine in a special way.
    If you are unsure whether a punctuation character has a special meaning, you can put \ in front of it. This makes sure the character is not treated in a special way. (A backslash in front of a letter or digit can instead start a special sequence, as described next.)

    Special Sequences
    Special sequences make commonly used patterns easier to write. Here's a list of special sequences:

    \A - Matches if the specified characters are at the start of a string.
  • 9.
    Expression   String       Matched?
    \Athe        the sun      Match
                 In the sun   No match

    \b - Matches if the specified characters are at the beginning or end of a word.

    Expression   String          Matched?
    \bfoo        football        Match
                 a football      Match
                 afootball       No match
    foo\b        the foo         Match
                 the afoo test   Match
                 the afootest    No match

    \B - Opposite of \b. Matches if the specified characters are not at the beginning or end of a word.
  • 10.
    Expression   String          Matched?
    \Bfoo        football        No match
                 a football      No match
                 afootball       Match
    foo\B        the foo         No match
                 the afoo test   No match
                 the afootest    Match

    \d - Matches any decimal digit. Equivalent to [0-9] for ASCII text.

    Expression   String   Matched?
    \d           12abc3   3 matches (1, 2, 3)
                 Python   No match

    \D - Matches any character that is not a decimal digit. Equivalent to [^0-9] for ASCII text.
  • 11.
    Expression   String     Matched?
    \D           1ab34"50   3 matches (a, b, ")
                 1345       No match

    \s - Matches where a string contains any whitespace character. Equivalent to [ \t\n\r\f\v].

    Expression   String         Matched?
    \s           Python RegEx   1 match
                 PythonRegEx    No match

    \S - Matches where a string contains any non-whitespace character. Equivalent to [^ \t\n\r\f\v].

    Expression   String              Matched?
    \S           a b                 2 matches (a, b)
                 (whitespace only)   No match

    \w - Matches any alphanumeric character (digits and letters). Equivalent to [a-zA-Z0-9_] for ASCII text. Note that the underscore _ is also treated as an alphanumeric character.
  • 12.
    Expression   String     Matched?
    \w           12&": ;c   3 matches (1, 2, c)
                 %"> !      No match

    \W - Matches any non-alphanumeric character. Equivalent to [^a-zA-Z0-9_] for ASCII text.

    Expression   String   Matched?
    \W           1a2%c    1 match (%)
                 Python   No match

    \Z - Matches if the specified characters are at the end of a string.

    Expression   String                      Matched?
    Python\Z     I like Python               1 match
                 I like Python Programming   No match
                 Python is fun.              No match

    Now that we understand the basics of RegEx, let's discuss how to use RegEx in your Python code.
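As a bridge into the Python part, the special-sequence tables above can be checked with the sketch below. It assumes the re module described on the next slides, and the variable names are illustrative only. The patterns are written as raw strings (r"...") so that Python passes the backslash through to the RegEx engine unchanged.

```python
import re

text = "a football afootball"

# \b needs a word boundary before 'foo'; \B needs the absence of one.
boundary = re.findall(r"\bfoo", text)      # the 'foo' in 'football'
non_boundary = re.findall(r"\Bfoo", text)  # the 'foo' in 'afootball'

digits = re.findall(r"\d", "12abc3")
non_digits = re.findall(r"\D", '1ab34"50')
word_chars = re.findall(r"\w", '12&": ;c')
non_word = re.findall(r"\W", "1a2%c")

# \A and \Z anchor the pattern to the start and end of the whole string.
at_start = re.search(r"\Athe", "In the sun") is not None
at_end = re.search(r"Python\Z", "I like Python") is not None

print(boundary, non_boundary)  # ['foo'] ['foo']
print(digits)                  # ['1', '2', '3']
print(non_digits)              # ['a', 'b', '"']
print(word_chars, non_word)    # ['1', '2', 'c'] ['%']
print(at_start, at_end)        # False True
```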
  • 13. Python RegEx
    Python has a module named re to work with regular expressions. To use it, we need to import the module.

        import re

    The module defines several functions and constants to work with RegEx.

    re.search()
    The re.search() function takes two arguments: a pattern and a string. It looks for the first location where the RegEx pattern produces a match with the string.
    If the search is successful, re.search() returns a match object; if not, it returns None.
    Syntax of the function:

        s = re.search(pattern, str)

    Write a Python program to perform pattern matching using the search() function.

        import re

        string = "Python is fun"
        # Look for the pattern anywhere in the string
        s = re.search('Python', string)
        if s:
            print("pattern found inside the string")
        else:
            print("pattern not found")
  • 14. Here, s contains a match object.

    s.start(), s.end(), s.span() and s.group()
    The start() method returns the index of the start of the matched substring. Similarly, end() returns the end index of the matched substring. The span() method returns a tuple containing the start and end index of the matched part, and group() returns the matched substring itself.

        >>> s.start()
        0
        >>> s.end()
        6
        >>> s.span()
        (0, 6)
        >>> s.group()
        'Python'

    re.match()
    The re.match() function takes two arguments: a pattern and a string. If the pattern is found at the start of the string, the function returns a match object. If not, it returns None.
    Write a Python program to perform pattern matching using the match() function.

        import re

        pattern = '^a...s$'
        test_string = 'abyss'
        # match() only tries the pattern at the start of the string
        result = re.match(pattern, test_string)
        if result:
            print("Search successful.")
        else:
            print("Search unsuccessful.")

    Here, we used the re.match() function to test whether pattern matches at the start of test_string.
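The difference between re.search() and re.match() is easiest to see when the pattern does not sit at the start of the string. The following sketch uses a sample sentence and variable names chosen here for illustration.

```python
import re

text = "I like Python"

found_anywhere = re.search("Python", text)  # scans the whole string
found_at_start = re.match("Python", text)   # only tries position 0

print(found_anywhere.span())   # (7, 13)
print(found_anywhere.group())  # Python
print(found_at_start)          # None
```

Because both functions return None on failure, the result should be tested (for example with if) before calling start(), end(), span() or group() on it.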
  • 15. re.sub()
    The syntax of re.sub() is:

        re.sub(pattern, replace, string)

    The function returns a string where matched occurrences are replaced with the content of the replace variable. If the pattern is not found, re.sub() returns the original string.
    You can pass count as an optional fourth argument to re.sub(). If omitted, it defaults to 0, which replaces all occurrences.

    Example 1:
        re.sub('^a','b','aaa')
        Output: 'baa'
    Example 2:
        s=re.sub('a','b','aaa')
        print(s)
        Output: bbb
    Example 3:
        s=re.sub('a','b','aaa',count=2)
        print(s)
        Output: bba

    re.subn()
    re.subn() is similar to re.sub() except that it returns a tuple of 2 items containing the new string and the number of substitutions made.

    Example 1:
        s=re.subn('a','b','aaa')
        print(s)
        Output: ('bbb', 3)
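The substitution examples above can be gathered into one runnable sketch. The variable names are illustrative. count is passed by keyword here because the re documentation for Python 3.13 deprecates passing it positionally; the keyword form also works on older versions.

```python
import re

text = "aaa"

all_replaced = re.sub("a", "b", text)           # count defaults to 0: replace all
first_two = re.sub("a", "b", text, count=2)     # replace at most two occurrences
unchanged = re.sub("x", "b", text)              # no match: original string returned
new_text, n_subs = re.subn("a", "b", text)      # also reports how many were replaced

print(all_replaced)      # bbb
print(first_two)         # bba
print(unchanged)         # aaa
print(new_text, n_subs)  # bbb 3
```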
  • 16. re.findall()
    The re.findall() function returns a list of strings containing all matches. If the pattern is not found, re.findall() returns an empty list.
    Syntax:

        re.findall(pattern, string)

    Example 1:
        s=re.findall('a','abab')
        print(s)
        Output: ['a', 'a']

    re.split()
    The re.split() function splits the string wherever there is a match and returns a list of the resulting substrings. If the pattern is not found, re.split() returns a list containing the original string.
    You can pass a maxsplit argument to re.split(). It is the maximum number of splits that will occur. The default value of maxsplit is 0, meaning all possible splits.
    Syntax:

        re.split(pattern, string)

    Example 1:
        s=re.split('a','abab')
        print(s)
        Output: ['', 'b', 'b']
    Example 2:
        s=re.split('a','aababa',maxsplit=3)
        print(s)
        Output: ['', '', 'b', 'ba']
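As a further illustration, re.findall() and re.split() can be combined with the metacharacters from the first part. The strings and variable names below are chosen for this sketch and are not from the slides, apart from the [0-9]{2,4} example.

```python
import re

text = "12 and 345673"

numbers = re.findall(r"[0-9]{2,4}", text)         # runs of 2 to 4 digits
nothing = re.findall(r"[0-9]{2,4}", "1 and 2")    # no run is long enough
pieces = re.split(r"\s+and\s+", text)             # split on 'and' with its spaces
limited = re.split("a", "aababa", maxsplit=1)     # stop after the first split

print(numbers)  # ['12', '3456', '73']
print(nothing)  # []
print(pieces)   # ['12', '345673']
print(limited)  # ['', 'ababa']
```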
  • 17. CASE STUDY
    Street Addresses:
    In this case study, we take one street address as input and perform some operations on it using library functions.

    Example:
        import re

        str1='100 NORTH MAIN ROAD'
        str1.replace('ROAD','RD')
        Output: '100 NORTH MAIN RD'
        str1.replace('NORTH','NRTH')
        Output: '100 NRTH MAIN ROAD'
        re.sub('ROAD','RD',str1)
        Output: '100 NORTH MAIN RD'
        re.sub('NORTH','NRTH',str1)
        Output: '100 NRTH MAIN ROAD'
        re.split('A',str1)
        Output: ['100 NORTH M', 'IN RO', 'D']
        re.findall('O',str1)
        Output: ['O', 'O']
        re.sub('^1','2',str1)
        Output: '200 NORTH MAIN ROAD'

    Roman Numerals
    I = 1
    V = 5
    X = 10
  • 18. L = 50
    C = 100
    D = 500
    M = 1000
    In Roman numerals, 4 is written as IV, 9 as IX, 40 as XL, 90 as XC, and 900 as CM.
    Let us write the Roman numeral representation of a few numbers.
    Ex1: 1940 MCMXL
    Ex2: 1946 MCMXLVI
    Ex3: 1888 MDCCCLXXXVIII

    Checking for thousands:
    1000=M 2000=MM 3000=MMM
    A possible pattern is up to three optional M characters.

    Example:
        pattern = '^M?M?M?$'
        re.search(pattern, 'M')
        Output: <re.Match object; span=(0, 1), match='M'>
        re.search(pattern, 'MM')
        Output: <re.Match object; span=(0, 2), match='MM'>
        re.search(pattern, 'MMM')
  • 19.
        Output: <re.Match object; span=(0, 3), match='MMM'>
        re.search(pattern, 'ML')
        re.search(pattern, 'MX')
        re.search(pattern, 'MI')
        re.search(pattern, 'MMMM')
        (These four calls return None, so no output is shown.)

    Checking for Hundreds:
    100=C 200=CC 300=CCC 400=CD 500=D 600=DC 700=DCC 800=DCCC 900=CM

    Example:
        pattern = '^M?M?M?(CM|CD|D?C?C?C?)$'
        re.search(pattern,'MCM')
        Output: <re.Match object; span=(0, 3), match='MCM'>
        re.search(pattern,'MD')
        Output: <re.Match object; span=(0, 2), match='MD'>
        re.search(pattern,'MMMCCC')
        Output: <re.Match object; span=(0, 6), match='MMMCCC'>
        re.search(pattern,'MCMLXX')
        (Returns None: the pattern does not cover tens yet.)

    Using the {n,m} syntax
  • 20. The {n,m} syntax checks that the pattern to its left occurs at least n times and at most m times in the string.

    Example:
        pattern='^M{0,3}$'
        re.search(pattern,'MM')
        Output: <re.Match object; span=(0, 2), match='MM'>
        re.search(pattern,'M')
        Output: <re.Match object; span=(0, 1), match='M'>
        re.search(pattern,'MMM')
        Output: <re.Match object; span=(0, 3), match='MMM'>

    Checking for Tens and Ones:
    1=I 2=II 3=III 4=IV 5=V 6=VI 7=VII 8=VIII 9=IX
    10=X 20=XX 30=XXX 40=XL 50=L 60=LX 70=LXX 80=LXXX 90=XC

    Example:
        pattern='^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$'
        re.search(pattern,'MDLVI')
        Output: <re.Match object; span=(0, 5), match='MDLVI'>
        re.search(pattern,'MCMXLVI')
        Output: <re.Match object; span=(0, 7), match='MCMXLVI'>
        re.search(pattern,'MMMCCCXLV')
        Output: <re.Match object; span=(0, 9), match='MMMCCCXLV'>
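The pieces of the Roman numeral case study can be wrapped into a small validator. This is a minimal sketch, not part of the original slides: the names ROMAN_VERBOSE, ROMAN_COMPACT and is_roman are hypothetical, and the compact pattern is one possible way of applying the {n,m} syntax to every place value. Both patterns also match the empty string (every part is optional), so the helper rejects empty input explicitly.

```python
import re

# The final pattern from the case study.
ROMAN_VERBOSE = re.compile(
    r"^M?M?M?(CM|CD|D?C?C?C?)(XC|XL|L?X?X?X?)(IX|IV|V?I?I?I?)$"
)

# The same idea written with the {n,m} syntax (an assumed rewrite).
ROMAN_COMPACT = re.compile(
    r"^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
)

def is_roman(numeral):
    """Return True for a non-empty Roman numeral up to MMMCMXCIX (3999)."""
    return bool(numeral) and ROMAN_COMPACT.search(numeral) is not None

samples = ["MCMXL", "MCMXLVI", "MDCCCLXXXVIII", "MMMM", "IIII", "ML", ""]
results = {s: is_roman(s) for s in samples}

# On these samples, both patterns agree.
patterns_agree = all(
    bool(ROMAN_VERBOSE.search(s)) == bool(ROMAN_COMPACT.search(s))
    for s in samples
)

print(results)
print(patterns_agree)  # True
```

Here is_roman("MMMM") and is_roman("IIII") are False because no place value may repeat a symbol more than three times, while is_roman("ML") is True (1050).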