Perl Programming Course
Working with text
Regular expressions



Krassimir Berov

I-can.eu
Contents
1. Simple word matching
2. Character classes
3. Matching this or that
4. Grouping
5. Extracting matches
6. Matching repetitions
7. Search and replace
8. The split operator
Simple word matching
• It's all about identifying patterns in text
• The simplest regex is simply
  a word – a string of characters.
• A regex consisting of a word matches any
  string that contains that word
• The sense of the match can be reversed
  by using the !~ operator
 my $string = 'some probably big string containing just about anything in it';
 print "found 'string'\n" if $string =~ /string/;
 print "it is not about dogs\n" if $string !~ /dog/;
Simple word matching (2)
• The literal string in the regex can be
  replaced by a variable
• If matching against $_ , the $_ =~ part
  can be omitted:

 my $string = 'stringify this world';
 my $word = 'string'; my $animal = 'dog';
 print "found '$word'\n" if $string =~ /$word/;
 print "it is not about ${animal}s\n"
     if $string !~ /$animal/;
 # Inside the loop each item is in $_, so /$word/ is matched against $_
 for ('dog', 'string', 'dog') {
     print "$word\n" if /$word/;
 }
Simple word matching (3)
• The // default delimiters for a match can be
  changed to arbitrary delimiters by putting an 'm'
  in front
• Regexes must match a part of the string exactly
  in order for the statement to be true
 my $string = 'Stringify this world!';
 my $word = 'string'; my $animal = 'dog';
 # Case-sensitive: 'string' does not match 'Stringify', so nothing is printed
 print "found '$word'\n" if $string =~ m#$word#;
 print "found '$word' in any case\n"
    if $string =~ m#$word#i;
 print "it is not about ${animal}s\n"
     if $string !~ m($animal);
 for ('dog', 'string', 'Dog') {
     local $\ = $/;   # print a newline after each print
     print if m|$animal|;
 }
Simple word matching (4)
• perl will always match at the earliest possible
  point in the string
 my $string ='Stringify this stringy world!';
 my $word = 'string';
 print "found '$word' in any case\n"
    if $string =~ m{$word}i;

• Some characters, called metacharacters, are
  reserved for use in regex notation. The
  metacharacters are (14):
 { } [ ] ( ) ^ $ . | * + ? \

• A metacharacter can be matched by putting a
  backslash before it
 # $string above contains no dot, so this prints nothing
 print "The string \n'$string'\n contains a DOT\n"
     if $string =~ m|\.|;
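
When the pattern comes from a variable, any metacharacters inside it keep their special meaning. A minimal sketch, assuming the hypothetical strings below, shows how Perl's \Q...\E escape (the regex form of the built-in quotemeta) puts the backslashes in for you:

```perl
use strict; use warnings;

# Hypothetical data, chosen only for illustration.
my $text   = 'The price is 3.50 today, not 3x50';
my $needle = '3.50';

# Unescaped: '.' matches any character, so '3x50' matches as well.
my $loose = () = $text =~ /$needle/g;       # number of matches
# Escaped with \Q...\E: '.' matches only a literal dot.
my $exact = () = $text =~ /\Q$needle\E/g;

print "unescaped matches: $loose\n";
print "escaped matches:   $exact\n";
```

The `() =` assignment is only a counting idiom: it forces list context and yields the number of matches.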
Simple word matching (5)
• Non-printable ASCII characters are
  represented by escape sequences
• Arbitrary bytes are represented by octal
  escape sequences
 use utf8;
 binmode(STDOUT, ':utf8') if $ENV{LANG} =~ /UTF-8/;
 $\ = $/;   # print a newline after each print
 my $string = "contains\r\n Then we have some\t\t\ttabs. б";
 print 'matched б(\x{431})'
     if $string =~ /\x{431}/;
 print 'matched б' if $string =~ /б/;
 print 'matched \r\n' if $string =~ /\r\n/;
 print 'The string was:"' . $string . '"';
Simple word matching (6)
• To specify where it should match, use the
  anchor metacharacters ^ and $ .



 use strict; use warnings; $\ = $/;
 my $string = 'A probably long chunk of text containing strings';
 print 'matched "A"' if $string =~ /^A/;
 print 'matched "strings"' if $string =~ /strings$/;
 print 'matched "A", matched "strings" and something in between'
     if $string =~ /^A.*?strings$/;
Character classes
• A character class allows a set of possible
  characters, rather than just a single character,
  to match at a particular point in a regex
• Character classes are denoted by brackets [ ]
  with the set of characters to be possibly matched
  inside
• The special characters for a character class are
  - ] \ ^ $ and are matched using an escape
• The special character '-' acts as a range operator
  within character classes so you can write [0-9]
  and [a-z]
Character classes
• Example
 use strict; use warnings; $\ = $/;
 my $string = 'A probably long chunk of text containing strings';
 my $thing = 'ong ung ang enanything';
 my $every = 'iiiiii';
 my $nums   = 'I have 4325 Euro';
 my $class = 'dog';
 print 'matched any of a, b or c'
    if $string =~ /[abc]/;

 for($thing, $every, $string){
     print 'ingy brrrings nothing using: '.$_
         if /[$class]/
 }
 print $nums if $nums =~/[0-9]/;
Character classes
• Perl has several abbreviations for common character
  classes
   • \d is a digit – [0-9]
   • \s is a whitespace character – [\ \t\r\n\f]
   • \w is a word character
     (alphanumeric or _) – [0-9a-zA-Z_]
   • \D is a negated \d – any character but a digit [^0-9]
   • \S is a negated \s; it represents any non-whitespace
     character [^\s]
   • \W is a negated \w – any non-word character
   • The period '.' matches any character but "\n"
   • The \d \s \w \D \S \W abbreviations can be used both
     inside and outside of character classes
   • The word anchor \b matches a boundary between a word
     character and a non-word character: \w\W or \W\w
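
The word anchor is easiest to see next to a plain substring match. A small sketch, assuming a made-up sample sentence, compares the two:

```perl
use strict; use warnings;

# Hypothetical sample sentence.
my $text = 'a cat concatenates categories';

my @anywhere = $text =~ /cat/g;       # 'cat' as a substring
my @words    = $text =~ /\bcat\b/g;   # 'cat' as a whole word only

print scalar(@anywhere), " substring matches\n";
print scalar(@words),    " whole-word matches\n";
```

Here \b on both sides rejects the 'cat' inside 'concatenates' and at the start of 'categories', because a word character follows or precedes it.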
Character classes
• Example
 my $digits ="here are some digits3434 and then ";

 print 'found digit' if $digits =~ /\d/;

 print 'found alphanumeric' if $digits =~ /\w/;

 print 'found space' if $digits =~ /\s/;

 print 'digit followed by space, followed by letter'
    if $digits =~ /\d\s[A-Za-z]/;
Matching this or that
• We can match different character strings
  with the alternation metacharacter '|'
• perl will try to match the regex at the
  earliest possible point in the string
 my $digits ="here are some digits3434 and then ";

 print 'found "are" or "and"' if $digits =~/are|and/;
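
The "earliest possible point" rule has two consequences that a short sketch can show. The strings are illustrative only, and the parentheses are used just to see which text matched:

```perl
use strict; use warnings;

my $pets = 'cats and dogs';

# The earliest position in the string wins, whatever the order of alternatives.
my ($first) = $pets =~ /(dogs|cats)/;
print "matched '$first'\n";

# At a given position, the first alternative that succeeds is taken.
my ($short) = 'cats' =~ /(c|ca|cat|cats)/;
my ($long)  = 'cats' =~ /(cats|cat|ca|c)/;
print "matched '$short' and '$long'\n";
```

So when one alternative is a prefix of another, listing the longer one first is usually what you want.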
Extracting matches
• The grouping metacharacters () also
  allow the extraction of the parts of a
  string that matched
• For each grouping, the part that matched
  inside goes into the special variables $1 ,
  $2 , etc.
• They can be used just as ordinary
  variables
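
As a minimal sketch of the bullets above, assuming a hypothetical hh:mm:ss timestamp, each pair of parentheses fills the next numbered variable:

```perl
use strict; use warnings;

# Hypothetical timestamp in hh:mm:ss form.
my $time = '09:45:07';

if ($time =~ /(\d\d):(\d\d):(\d\d)/) {
    my ($hours, $minutes, $seconds) = ($1, $2, $3);
    print "hours=$hours minutes=$minutes seconds=$seconds\n";
}

# In list context a successful match returns the captured groups directly.
my ($h, $m, $s) = $time =~ /(\d\d):(\d\d):(\d\d)/;
print "h=$h m=$m s=$s\n";
```

Checking that the match succeeded before reading $1, $2, ... matters, because after a failed match they still hold the values from the previous successful one.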
Extracting matches
• The grouping metacharacters () allow a part of a
  regex to be treated as a single unit
• If the groupings in a regex are nested, $1 gets the
  group with the leftmost opening parenthesis, $2
  the next opening parenthesis, etc.
  my $digits = "here are some digits3434 and then678 ";

  print 'found a letter followed by a letter or digits: '.$1
      if $digits =~/[a-z]([a-z]|\d+)/;
  print 'found a letter followed by digits: '.$1
      if $digits =~/([a-z](\d+))/;
  #                 $1    $2
  print 'found letters followed by digits: '.$1
      if $digits =~/([a-z]+)(\d+)/;
  #                 $1      $2
Matching repetitions
• The quantifier metacharacters ?, * , + , and {} allow
  us to determine the number of repeats of a portion
  of a regex
• Quantifiers are put immediately after the character,
  character class, or grouping
   • a? = match 'a' 1 or 0 times
   • a* = match 'a' 0 or more times, i.e., any number of times
   • a+ = match 'a' 1 or more times, i.e., at least once
   • a{n,m} = match at least n times, but not more than m
     times
   • a{n,} = match at least n times
   • a{n} = match exactly n times
Matching repetitions


use strict; use warnings; $\ = $/;
my $digits = "here are some digits3434 and then678 ";

print 'found some letters followed by letters or digits: '.$1 .$2
    if $digits =~ /([a-z]{2,})(\w+)/;

print 'found three letters followed by digits: '.$1 .$2
    if $digits =~ /([a-z]{3}(\d+))/;

print 'found up to four letters followed by digits: '.$1 .$2
    if $digits =~ /([a-z]{1,4})(\d+)/;
Matching repetitions
• Greeeedy

 use strict; use warnings; $\ = $/;
 my $digits = "here are some digits3434 and then678 ";

 print 'found as many letters as possible followed by digits: '.$1 .$2
     if $digits =~ /([a-z]*)(\d+)/;
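
Greediness is easier to see next to its opposite. This sketch is not from the slides: it reuses the same $digits string and the ? suffix (as in the /^A.*?strings$/ example earlier), which makes a quantifier match as little as possible:

```perl
use strict; use warnings;

my $digits = "here are some digits3434 and then678 ";

# Greedy: \w+ grabs as much as it can and gives back only one digit.
my ($g1, $g2) = $digits =~ /(\w+)(\d+)/;
# Non-greedy: \w+? stops as soon as \d+ is able to match.
my ($l1, $l2) = $digits =~ /(\w+?)(\d+)/;

print "greedy:     '$g1' + '$g2'\n";
print "non-greedy: '$l1' + '$l2'\n";
```

Both patterns match the same overall text, 'digits3434'. Only the split between the two groups differs.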
Search and replace
• Search and replace is performed using
  s/regex/replacement/modifiers.
• The replacement is a Perl double quoted string
  that replaces in the string whatever is matched with
  the regex .
• The operator =~ is used to associate a string with
  s///.
• If matching against $_ , the $_ =~ can be dropped.
• If there is a match, s/// returns the number of
  substitutions made, otherwise it returns false
Search and replace
• The matched variables $1 , $2 , etc. are immediately
  available for use in the replacement expression.
• With the global modifier, s///g will search and
  replace all occurrences of the regex in the string
• The evaluation modifier s///e wraps an eval{...}
  around the replacement string and the evaluated
  result is substituted for the matched substring.
• s/// can use other delimiters, such as s!!! and s{}{},
  and even s{}//
• If single quotes are used s''', then the regex and
  replacement are treated as single quoted strings
Search and replace
• Example

 #TODO....
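
A minimal sketch of the points from the previous two slides, using a made-up sample string rather than the author's own example:

```perl
use strict; use warnings;

# Hypothetical sample text.
my $text = 'I have 4325 Euro and 10 Euro more';

# s/// changes the string in place, so work on copies of $text.
(my $once = $text) =~ s/Euro/EUR/;     # first occurrence only
(my $all  = $text) =~ s/Euro/EUR/g;    # every occurrence

# The return value is the number of substitutions made.
my $copy  = $text;
my $count = ($copy =~ s/Euro/EUR/g);

# $1 is available in the replacement.
(my $swapped = $text) =~ s/(\d+) Euro/Euro $1/g;

# With /e the replacement is evaluated as Perl code.
(my $doubled = $text) =~ s/(\d+)/$1 * 2/ge;

print "$_\n" for $once, $all, $count, $swapped, $doubled;
```

The `(my $copy = $text) =~ s/.../.../` form copies first and then substitutes in the copy, which leaves $text untouched.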
The split operator
• split /regex/, string
 splits string into a list of substrings and
 returns that list
• The regex determines the character sequence
  that string is split with respect to
 #TODO....
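
A short sketch of split with two different separators. The input strings are assumptions made up for illustration:

```perl
use strict; use warnings;

# Hypothetical input strings.
my $csv   = 'perl,regex,split';
my $words = "here  are\tsome words";

my @fields = split /,/, $csv;        # split on a literal comma
my @tokens = split /\s+/, $words;    # split on runs of whitespace

print join('|', @fields), "\n";
print join('|', @tokens), "\n";
```

Using \s+ rather than a single space keeps the double space and the tab from producing empty fields.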
Regular expressions

• Resources
  • perlrequick - Perl regular expressions quick start
  • perlre - Perl regular expressions
  • perlreref - Perl Regular Expressions Reference
  • Beginning Perl
    (Chapter 5 – Regular Expressions)
Regular expressions




Questions?
Ad

More Related Content

What's hot (19)

Perl programming language
Perl programming languagePerl programming language
Perl programming language
Elie Obeid
 
Intro to Perl and Bioperl
Intro to Perl and BioperlIntro to Perl and Bioperl
Intro to Perl and Bioperl
Bioinformatics and Computational Biosciences Branch
 
Introduction to Perl - Day 1
Introduction to Perl - Day 1Introduction to Perl - Day 1
Introduction to Perl - Day 1
Dave Cross
 
Perl 5.10 for People Who Aren't Totally Insane
Perl 5.10 for People Who Aren't Totally InsanePerl 5.10 for People Who Aren't Totally Insane
Perl 5.10 for People Who Aren't Totally Insane
Ricardo Signes
 
Class 5 - PHP Strings
Class 5 - PHP StringsClass 5 - PHP Strings
Class 5 - PHP Strings
Ahmed Swilam
 
Bioinformatics p1-perl-introduction v2013
Bioinformatics p1-perl-introduction v2013Bioinformatics p1-perl-introduction v2013
Bioinformatics p1-perl-introduction v2013
Prof. Wim Van Criekinge
 
Intoduction to php strings
Intoduction to php  stringsIntoduction to php  strings
Intoduction to php strings
baabtra.com - No. 1 supplier of quality freshers
 
Introduction to Perl and BioPerl
Introduction to Perl and BioPerlIntroduction to Perl and BioPerl
Introduction to Perl and BioPerl
Bioinformatics and Computational Biosciences Branch
 
DBIx::Class introduction - 2010
DBIx::Class introduction - 2010DBIx::Class introduction - 2010
DBIx::Class introduction - 2010
leo lapworth
 
LPW: Beginners Perl
LPW: Beginners PerlLPW: Beginners Perl
LPW: Beginners Perl
Dave Cross
 
PHP Strings and Patterns
PHP Strings and PatternsPHP Strings and Patterns
PHP Strings and Patterns
Henry Osborne
 
Intermediate Perl
Intermediate PerlIntermediate Perl
Intermediate Perl
Dave Cross
 
String variable in php
String variable in phpString variable in php
String variable in php
chantholnet
 
Introduction to Perl
Introduction to PerlIntroduction to Perl
Introduction to Perl
Dave Cross
 
Scripting3
Scripting3Scripting3
Scripting3
Nao Dara
 
Perl 101 - The Basics of Perl Programming
Perl  101 - The Basics of Perl ProgrammingPerl  101 - The Basics of Perl Programming
Perl 101 - The Basics of Perl Programming
Utkarsh Sengar
 
You Can Do It! Start Using Perl to Handle Your Voyager Needs
You Can Do It! Start Using Perl to Handle Your Voyager NeedsYou Can Do It! Start Using Perl to Handle Your Voyager Needs
You Can Do It! Start Using Perl to Handle Your Voyager Needs
Roy Zimmer
 
Introduction to Perl Best Practices
Introduction to Perl Best PracticesIntroduction to Perl Best Practices
Introduction to Perl Best Practices
José Castro
 
DBIx::Class beginners
DBIx::Class beginnersDBIx::Class beginners
DBIx::Class beginners
leo lapworth
 

Similar to Working with text, Regular expressions (20)

Lecture 23
Lecture 23Lecture 23
Lecture 23
rhshriva
 
Bioinformatics p2-p3-perl-regexes v2014
Bioinformatics p2-p3-perl-regexes v2014Bioinformatics p2-p3-perl-regexes v2014
Bioinformatics p2-p3-perl-regexes v2014
Prof. Wim Van Criekinge
 
regex.ppt
regex.pptregex.ppt
regex.ppt
ansariparveen06
 
Bioinformatica p2-p3-introduction
Bioinformatica p2-p3-introductionBioinformatica p2-p3-introduction
Bioinformatica p2-p3-introduction
Prof. Wim Van Criekinge
 
Bioinformatics p2-p3-perl-regexes v2013-wim_vancriekinge
Bioinformatics p2-p3-perl-regexes v2013-wim_vancriekingeBioinformatics p2-p3-perl-regexes v2013-wim_vancriekinge
Bioinformatics p2-p3-perl-regexes v2013-wim_vancriekinge
Prof. Wim Van Criekinge
 
Basta mastering regex power
Basta mastering regex powerBasta mastering regex power
Basta mastering regex power
Max Kleiner
 
Php Chapter 4 Training
Php Chapter 4 TrainingPhp Chapter 4 Training
Php Chapter 4 Training
Chris Chubb
 
Regular Expressions grep and egrep
Regular Expressions grep and egrepRegular Expressions grep and egrep
Regular Expressions grep and egrep
Tri Truong
 
Introduction to Regular Expressions RootsTech 2013
Introduction to Regular Expressions RootsTech 2013Introduction to Regular Expressions RootsTech 2013
Introduction to Regular Expressions RootsTech 2013
Ben Brumfield
 
PERL Regular Expression
PERL Regular ExpressionPERL Regular Expression
PERL Regular Expression
Binsent Ribera
 
Bioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introductionBioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introduction
Prof. Wim Van Criekinge
 
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...
Andrea Telatin
 
Regular Expressions
Regular ExpressionsRegular Expressions
Regular Expressions
Satya Narayana
 
Regular Expression
Regular ExpressionRegular Expression
Regular Expression
Lambert Lum
 
Course 102: Lecture 13: Regular Expressions
Course 102: Lecture 13: Regular Expressions Course 102: Lecture 13: Regular Expressions
Course 102: Lecture 13: Regular Expressions
Ahmed El-Arabawy
 
Regexp secrets
Regexp secretsRegexp secrets
Regexp secrets
Hiro Asari
 
Tutorial on Regular Expression in Perl (perldoc Perlretut)
Tutorial on Regular Expression in Perl (perldoc Perlretut)Tutorial on Regular Expression in Perl (perldoc Perlretut)
Tutorial on Regular Expression in Perl (perldoc Perlretut)
FrescatiStory
 
Eloquent Ruby chapter 4 - Find The Right String with Regular Expression
Eloquent Ruby chapter 4 - Find The Right String with Regular ExpressionEloquent Ruby chapter 4 - Find The Right String with Regular Expression
Eloquent Ruby chapter 4 - Find The Right String with Regular Expression
Kuyseng Chhoeun
 
Introduction to regular expressions
Introduction to regular expressionsIntroduction to regular expressions
Introduction to regular expressions
Ben Brumfield
 
Looking for Patterns
Looking for PatternsLooking for Patterns
Looking for Patterns
Keith Wright
 
Lecture 23
Lecture 23Lecture 23
Lecture 23
rhshriva
 
Bioinformatics p2-p3-perl-regexes v2013-wim_vancriekinge
Bioinformatics p2-p3-perl-regexes v2013-wim_vancriekingeBioinformatics p2-p3-perl-regexes v2013-wim_vancriekinge
Bioinformatics p2-p3-perl-regexes v2013-wim_vancriekinge
Prof. Wim Van Criekinge
 
Basta mastering regex power
Basta mastering regex powerBasta mastering regex power
Basta mastering regex power
Max Kleiner
 
Php Chapter 4 Training
Php Chapter 4 TrainingPhp Chapter 4 Training
Php Chapter 4 Training
Chris Chubb
 
Regular Expressions grep and egrep
Regular Expressions grep and egrepRegular Expressions grep and egrep
Regular Expressions grep and egrep
Tri Truong
 
Introduction to Regular Expressions RootsTech 2013
Introduction to Regular Expressions RootsTech 2013Introduction to Regular Expressions RootsTech 2013
Introduction to Regular Expressions RootsTech 2013
Ben Brumfield
 
PERL Regular Expression
PERL Regular ExpressionPERL Regular Expression
PERL Regular Expression
Binsent Ribera
 
Bioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introductionBioinformatica 06-10-2011-p2 introduction
Bioinformatica 06-10-2011-p2 introduction
Prof. Wim Van Criekinge
 
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...
Bioinformatica: Esercizi su Perl, espressioni regolari e altre amenità (BMR G...
Andrea Telatin
 
Regular Expression
Regular ExpressionRegular Expression
Regular Expression
Lambert Lum
 
Course 102: Lecture 13: Regular Expressions
Course 102: Lecture 13: Regular Expressions Course 102: Lecture 13: Regular Expressions
Course 102: Lecture 13: Regular Expressions
Ahmed El-Arabawy
 
Regexp secrets
Regexp secretsRegexp secrets
Regexp secrets
Hiro Asari
 
Tutorial on Regular Expression in Perl (perldoc Perlretut)
Tutorial on Regular Expression in Perl (perldoc Perlretut)Tutorial on Regular Expression in Perl (perldoc Perlretut)
Tutorial on Regular Expression in Perl (perldoc Perlretut)
FrescatiStory
 
Eloquent Ruby chapter 4 - Find The Right String with Regular Expression
Eloquent Ruby chapter 4 - Find The Right String with Regular ExpressionEloquent Ruby chapter 4 - Find The Right String with Regular Expression
Eloquent Ruby chapter 4 - Find The Right String with Regular Expression
Kuyseng Chhoeun
 
Introduction to regular expressions
Introduction to regular expressionsIntroduction to regular expressions
Introduction to regular expressions
Ben Brumfield
 
Looking for Patterns
Looking for PatternsLooking for Patterns
Looking for Patterns
Keith Wright
 
Ad

More from Krasimir Berov (Красимир Беров) (11)

Хешове
ХешовеХешове
Хешове
Krasimir Berov (Красимир Беров)
 
Списъци и масиви
Списъци и масивиСписъци и масиви
Списъци и масиви
Krasimir Berov (Красимир Беров)
 
Скаларни типове данни
Скаларни типове данниСкаларни типове данни
Скаларни типове данни
Krasimir Berov (Красимир Беров)
 
Въведение в Perl
Въведение в PerlВъведение в Perl
Въведение в Perl
Krasimir Berov (Красимир Беров)
 
System Programming and Administration
System Programming and AdministrationSystem Programming and Administration
System Programming and Administration
Krasimir Berov (Красимир Беров)
 
Network programming
Network programmingNetwork programming
Network programming
Krasimir Berov (Красимир Беров)
 
Processes and threads
Processes and threadsProcesses and threads
Processes and threads
Krasimir Berov (Красимир Беров)
 
Working with databases
Working with databasesWorking with databases
Working with databases
Krasimir Berov (Красимир Беров)
 
IO Streams, Files and Directories
IO Streams, Files and DirectoriesIO Streams, Files and Directories
IO Streams, Files and Directories
Krasimir Berov (Красимир Беров)
 
Syntax
SyntaxSyntax
Syntax
Krasimir Berov (Красимир Беров)
 
Introduction to Perl
Introduction to PerlIntroduction to Perl
Introduction to Perl
Krasimir Berov (Красимир Беров)
 
Ad

Recently uploaded (20)

tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.Greenhouse_Monitoring_Presentation.pptx.
Greenhouse_Monitoring_Presentation.pptx.
hpbmnnxrvb
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-UmgebungenHCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
HCL Nomad Web – Best Practices und Verwaltung von Multiuser-Umgebungen
panagenda
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveDesigning Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep Dive
ScyllaDB
 
TrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business ConsultingTrsLabs - Fintech Product & Business Consulting
TrsLabs - Fintech Product & Business Consulting
Trs Labs
 

Working with text, Regular expressions

  • 1. Perl Programming Course Working with text Regular expressions Krassimir Berov I-can.eu
  • 2. Contents 1. Simple word matching 2. Character classes 3. Matching this or that 4. Grouping 5. Extracting matches 6. Matching repetitions 7. Search and replace 8. The split operator
  • 3. Simple word matching • It's all about identifying patterns in text • The simplest regex is simply a word – a string of characters. • A regex consisting of a word matches any string that contains that word • The sense of the match can be reversed by using !~ operator my $string ='some probably big string containing just about anything in it'; print "found 'string'n" if $string =~ /string/; print "it is not about dogsn" if $string !~ /dog/;
  • 4. Simple word matching (2) • The literal string in the regex can be replaced by a variable • If matching against $_ , the $_ =~ part can be omitted: my $string ='stringify this world'; my $word = 'string'; my $animal = 'dog'; print "found '$word'n" if $string =~ /$word/; print "it is not about ${animal}sn" if $string !~ /$animal/; for('dog','string','dog'){ print "$wordn" if /$word/ }
  • 5. Simple word matching (3) • The // default delimiters for a match can be changed to arbitrary delimiters by putting an 'm' in front • Regexes must match a part of the string exactly in order for the statement to be true my $string ='Stringify this world!'; my $word = 'string'; my $animal = 'dog'; print "found '$word'n" if $string =~ m#$word#; print "found '$word' in any casen" if $string =~ m#$word#i; print "it is not about ${animal}sn" if $string !~ m($animal); for('dog','string','Dog'){ local $=$/; print if m|$animal| }
  • 6. Simple word matching (4) • perl will always match at the earliest possible point in the string my $string ='Stringify this stringy world!'; my $word = 'string'; print "found '$word' in any casen" if $string =~ m{$word}i; • Some characters, called metacharacters, are reserved for use in regex notation. The metacharacters are (14): { } [ ] ( ) ^ $ . | * + ? • A metacharacter can be matched by putting a backslash before it print "The string n'$string'n contains a DOTn" if $string =~ m|.|;
  • 7. Simple word matching (5) • Non-printable ASCII characters are represented by escape sequences • Arbitrary bytes are represented by octal escape sequences use utf8; binmode(STDOUT, ':utf8') if $ENV{LANG} =~/UTF-8/; $=$/; my $string ="containsrn Then we have sometttabs. б"; print 'matched б(x{431})' if $string =~ /x{431}/; print 'matched б' if $string =~/б/; print 'matched rn' if $string =~/rn/; print 'The string was:"' . $string.'"';
  • 8. Simple word matching (6)
    • To specify where it should match, use the anchor metacharacters ^ and $ : ^ matches at the beginning of the string, $ at the end
      use strict; use warnings; $\ = $/;
      my $string = 'A probably long chunk of text containing strings';
      print 'matched "A"' if $string =~ /^A/;
      print 'matched "strings"' if $string =~ /strings$/;
      print 'matched "A", matched "strings" and something in between'
          if $string =~ /^A.*?strings$/;
  • 9. Character classes
    • A character class allows any one of a set of possible characters to match at a given point in the regex
    • Character classes are denoted by brackets [ ], with the set of characters to be possibly matched inside
    • The special characters for a character class are - ] \ ^ $ and are matched using an escape
    • The special character '-' acts as a range operator within character classes, so you can write [0-9] and [a-z]
  • 10. Character classes
    • Example
      use strict; use warnings; $\ = $/;
      my $string = 'A probably long chunk of text containing strings';
      my $thing = 'ong ung ang enanything';
      my $every = 'iiiiii';
      my $nums = 'I have 4325 Euro';
      my $class = 'dog';
      print 'matched any of a, b or c' if $string =~ /[abc]/;
      for ($thing, $every, $string) {
          print 'ingy brrrings nothing using: ' . $_ if /[$class]/;
      }
      print $nums if $nums =~ /[0-9]/;
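The special characters inside a character class can be illustrated with a short sketch. The sample string and variable names below are assumptions invented for this example. It also relies on the fact that a '-' placed first in a class is taken literally and that a leading '^' negates the class.

```perl
use strict;
use warnings;

my $price = 'cost: 10-20$ [approx]';   # hypothetical sample text

my ($special, $not_lower, $two_digits) = ('', '', '');

# A leading '-' is literal; ']' and '$' are escaped with a backslash.
$special = $1 if $price =~ /([-\]\$])/;         # first of - ] $ in the string

# '^' as the first character negates the class.
$not_lower = $1 if $price =~ /([^a-z])/;        # first non-lowercase character

# '-' between two characters denotes a range.
$two_digits = $1 if $price =~ /([0-9][0-9])/;   # first pair of digits

print "special: $special, not lowercase: $not_lower, digits: $two_digits\n";
```

With this input the three captures are '-', ':' and '10'.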
  • 11. Character classes
    • Perl has several abbreviations for common character classes
    • \d is a digit: [0-9]
    • \s is a whitespace character: [ \t\r\n\f]
    • \w is a word character (alphanumeric or _): [0-9a-zA-Z_]
    • \D is a negated \d: any character but a digit, [^0-9]
    • \S is a negated \s: any non-whitespace character, [^\s]
    • \W is a negated \w: any non-word character, [^\w]
    • The period '.' matches any character but "\n"
    • The \d \s \w \D \S \W abbreviations can be used both inside and outside of character classes
    • The word anchor \b matches a boundary between a word character and a non-word character: \w\W or \W\w
  • 12. Character classes
    • Example
      my $digits = "here are some digits3434 and then ";
      print 'found digit' if $digits =~ /\d/;
      print 'found alphanumeric' if $digits =~ /\w/;
      print 'found space' if $digits =~ /\s/;
      print 'digit followed by space, followed by letter'
          if $digits =~ /\d\s[A-Za-z]/;   # [A-z] would also admit [ \ ] ^ _ and `
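The word anchor \b is not covered by the example above, so here is a minimal sketch of it. The sample text and the variable names are assumptions made up for this illustration.

```perl
use strict;
use warnings;

my $text = 'a scattered catalog';   # hypothetical sample text

# Without anchors, 'cat' is found inside 'scattered'.
my $plain = $text =~ /cat/ ? 1 : 0;

# With \b on both sides, only the standalone word 'cat' would match.
my $whole_word = $text =~ /\bcat\b/ ? 1 : 0;

# With \b in front only, 'cat' must start a word.
my $word_starting_with_cat = '';
$word_starting_with_cat = $1 if $text =~ /\b(cat\w*)/;

print "plain: $plain, whole word: $whole_word, ",
      "starts with cat: $word_starting_with_cat\n";
```

Here the plain match succeeds, the whole-word match fails, and the capture holds 'catalog'.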
  • 13. Matching this or that
    • We can match different character strings with the alternation metacharacter '|'
    • perl will try to match the regex at the earliest possible point in the string
      my $digits = "here are some digits3434 and then ";
      print 'found "are" or "and"' if $digits =~ /are|and/;
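How "the earliest possible point" interacts with alternation can be sketched like this. The sample strings are assumptions for illustration only. The second match additionally shows that, at the same starting position, the leftmost alternative that matches is the one taken.

```perl
use strict;
use warnings;

my $text = 'cats and dogs';   # hypothetical sample text

# 'dogs' is listed first, but 'cats' occurs earlier in the string.
my $earliest = '';
$earliest = $1 if $text =~ /(dogs|cats)/;

# All alternatives can match at position 0; the leftmost one wins.
my $leftmost = '';
$leftmost = $1 if 'cats' =~ /(c|ca|cat)/;

print "earliest: $earliest, leftmost alternative: $leftmost\n";
```

The first capture is 'cats' and the second is 'c'.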
  • 14. Extracting matches
    • The grouping metacharacters () also allow the extraction of the parts of a string that matched
    • For each grouping, the part that matched inside goes into the special variables $1 , $2 , etc.
    • They can be used just as ordinary variables
  • 15. Extracting matches
    • The grouping metacharacters () allow a part of a regex to be treated as a single unit
    • If the groupings in a regex are nested, $1 gets the group with the leftmost opening parenthesis, $2 the next opening parenthesis, etc.
      my $digits = "here are some digits3434 and then678 ";
      print 'found a letter followed by letters or digits: ' . $1
          if $digits =~ /[a-z]([a-z]|\d+)/;
      print 'found a letter followed by digits: ' . $1
          if $digits =~ /([a-z](\d+))/;    # $1 is the whole group, $2 the digits
      print 'found letters followed by digits: ' . $1
          if $digits =~ /([a-z]+)(\d+)/;   # $1 is the letters, $2 the digits
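The numbering of nested groups can be made concrete with a small sketch. The time string and the variable names are assumptions chosen for illustration. The last statement also uses the fact that a match in list context returns the captured groups.

```perl
use strict;
use warnings;

my $time = '12:34:56';   # hypothetical sample value

my ($whole, $hours, $minutes, $seconds) = ('', '', '', '');

# Groups are numbered by the position of their opening parenthesis:
# the outer group is $1, the inner ones are $2, $3 and $4.
if ($time =~ /((\d\d):(\d\d):(\d\d))/) {
    ($whole, $hours, $minutes, $seconds) = ($1, $2, $3, $4);
}

# In list context the match returns the captured groups directly.
my ($hh, $mm) = $time =~ /(\d\d):(\d\d)/;

print "whole: $whole, h: $hours, m: $minutes, s: $seconds\n";
print "list context: $hh and $mm\n";
```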
  • 16. Matching repetitions
    • The quantifier metacharacters ? , * , + , and {} allow us to determine the number of repeats of a portion of a regex
    • Quantifiers are put immediately after the character, character class, or grouping they apply to
    • a? = match 'a' 1 or 0 times
    • a* = match 'a' 0 or more times, i.e., any number of times
    • a+ = match 'a' 1 or more times, i.e., at least once
    • a{n,m} = match at least n times, but not more than m times
    • a{n,} = match at least n times
    • a{n} = match exactly n times
  • 17. Matching repetitions
      use strict; use warnings; $\ = $/;
      my $digits = "here are some digits3434 and then678 ";
      print 'found some letters followed by letters or digits: ' . $1 . $2
          if $digits =~ /([a-z]{2,})(\w+)/;
      print 'found three letters followed by digits: ' . $1 . $2
          if $digits =~ /([a-z]{3}(\d+))/;
      print 'found up to four letters followed by digits: ' . $1 . $2
          if $digits =~ /([a-z]{1,4})(\d+)/;
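  • 18. Matching repetitions
    • Greeeedy: a quantifier matches as much of the string as it can while still allowing the whole regex to match
      use strict; use warnings; $\ = $/;
      my $digits = "here are some digits3434 and then678 ";
      print 'found as many letters as possible followed by digits: ' . $1 . $2
          if $digits =~ /([a-z]*)(\d+)/;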
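Greediness becomes visible when two quantified parts compete for the same characters. The sketch below reuses the $digits string from the slide and contrasts a greedy \w+ with the non-greedy form \w+? (the same '?' suffix that appears in the .*? of the anchors example). The variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $digits = "here are some digits3434 and then678 ";

my ($greedy_word, $greedy_num) = ('', '');
my ($lazy_word,   $lazy_num)   = ('', '');

# Greedy: \w+ takes as much as it can and leaves \d+ a single digit.
($greedy_word, $greedy_num) = ($1, $2) if $digits =~ /(\w+)(\d+)/;

# Non-greedy: \w+? takes as little as it can, so \d+ gets all the digits.
($lazy_word, $lazy_num) = ($1, $2) if $digits =~ /(\w+?)(\d+)/;

print "greedy: '$greedy_word' + '$greedy_num'\n";
print "lazy:   '$lazy_word' + '$lazy_num'\n";
```

The greedy version splits the match as 'digits343' and '4', the non-greedy one as 'digits' and '3434'.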
  • 19. Search and replace
    • Search and replace is performed using s/regex/replacement/modifiers
    • The replacement is a Perl double-quoted string that replaces in the string whatever is matched with the regex
    • The operator =~ is used to associate a string with s///
    • If matching against $_ , the $_ =~ can be dropped
    • If there is a match, s/// returns the number of substitutions made, otherwise it returns false
  • 20. Search and replace
    • The matched variables $1 , $2 , etc. are immediately available for use in the replacement expression
    • With the global modifier, s///g will search and replace all occurrences of the regex in the string
    • The evaluation modifier s///e wraps an eval{...} around the replacement string and the evaluated result is substituted for the matched substring
    • s/// can use other delimiters, such as s!!! and s{}{}, and even s{}//
    • If single quotes are used, s''', then the regex and replacement are treated as single-quoted strings
  • 21. Search and replace • Example #TODO....
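The s/// features listed on the previous slides can be sketched as follows. This is only an illustrative sketch, not the author's intended example; the sample sentence and all variable names are assumptions.

```perl
use strict;
use warnings;

my $text = 'I have 4325 Euro and 10 Euro';   # hypothetical sample text

# Without modifiers only the first occurrence is replaced.
(my $first_only = $text) =~ s/Euro/EUR/;

# With /g every occurrence is replaced; s/// returns the number of substitutions.
my $all   = $text;
my $count = ($all =~ s/Euro/EUR/g);

# $1 and $2 are available in the replacement.
(my $swapped = $text) =~ s/(\d+) (Euro)/$2 $1/g;

# With /e the replacement is evaluated as Perl code.
(my $doubled = $text) =~ s/(\d+)/$1 * 2/ge;

# Other delimiters work too.
(my $braces = $text) =~ s{Euro}{EUR}g;

# No match: the return value is false and the string is unchanged.
my $copy = $text;
my $none = ($copy =~ s/Dollar/USD/);

print "$first_only\n$all ($count)\n$swapped\n$doubled\n$braces\n";
print "nothing replaced\n" unless $none;
```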
  • 22. The split operator
    • split /regex/, string splits string into a list of substrings and returns that list
    • The regex determines the character sequence that string is split with respect to
    #TODO....
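A minimal sketch of split with two different separator regexes follows. The input strings and variable names are assumptions made up for illustration and are not the example the slide leaves as a TODO.

```perl
use strict;
use warnings;

# Split on runs of whitespace.
my @words = split /\s+/, 'A probably long  chunk';

# Split on a comma or semicolon with optional surrounding whitespace.
my $list   = 'perl,regex;split , demo';   # hypothetical sample text
my @fields = split /\s*[,;]\s*/, $list;

print scalar(@words), " words: @words\n";
print scalar(@fields), " fields: @fields\n";
```

Both calls return four substrings; the separators themselves are not part of the result.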
  • 23. Regular expressions
    • Resources
    • perlrequick - Perl regular expressions quick start
    • perlre - Perl regular expressions
    • perlreref - Perl Regular Expressions Reference
    • Beginning Perl (Chapter 5, Regular Expressions)