Regular Expressions for the
Web Application Developer
             By Andrew Kandels
Regular Expressions
Regular expressions provide a concise, flexible means for
matching strings of text, such as words or patterns of
characters.
POSIX                                 PCRE
Portable Operating System Interface   Perl Compatible Regular Expressions

• Traditional Unix regular            • Perl 5 extended features
  expression syntax                   • Native C extension
• PHP’s ereg_ functions               • Generally faster
• Basic and extended versions         • Optimization qualifiers
                                      • Used by:
                                        programming languages,
                                        Apache and other servers
Why Use Them?
•   Input Validation
•   Input Filtering
•   Search and Replace
•   Parsing and Data Extraction
•   Dynamic Recursion
•   Automation
In PHP, POSIX = Deprecated
The ereg_* functions were deprecated in PHP 5.3 and removed in
PHP 7.0.
Switching to preg_* is generally painless. Pain points:

•   Different matching criteria (greed)
•   preg_* requires delimiters
•   Different characters require escape sequences
•   preg favors option modifiers over functions
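
A minimal sketch of what the switch can look like. The old calls appear only in comments because they no longer exist in PHP 7 and later; the sample strings and variable names are made up for illustration.

```php
<?php
// Old: ereg('^[0-9]{5}$', $zip)
// New: add delimiters around the pattern.
$isZip = preg_match('/^[0-9]{5}$/', '55401') === 1;

// Old: eregi('^hello', $s)
// New: a modifier (/i) replaces the separate case-insensitive function.
$hello = preg_match('/^hello/i', 'Hello world');

// Old: ereg_replace('[0-9]+', '#', $s)
$clean = preg_replace('/[0-9]+/', '#', 'a1b22');

// Old: split('[,;]', $s)
$fields = preg_split('/[,;]/', 'a,b;c');
```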
Anatomy of a PHP Regular Expression


                           /foo/i
• Delimiters
• Pattern to match
• Options/modifiers
preg_replace(
   '/(href|src)=\'([^\']*)\'/i',
   '\1="\2"',
   $str
);
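
A runnable sketch of this replacement, assuming the slide's quotes stand for plain single quotes and that the replacement uses the backreferences \1 and \2. The input string is a made-up example.

```php
<?php
// Hypothetical input: attributes wrapped in single quotes.
$str = "<a HREF='page.html'><img src='logo.png'></a>";

// Rewrite href/src attributes to use double quotes.
// \1 is the attribute name as it was matched, \2 is the value.
$out = preg_replace('/(href|src)=\'([^\']*)\'/i', '\1="\2"', $str);

echo $out, "\n";
```

Because of the /i modifier the pattern also matches HREF, and \1 keeps whatever case appeared in the input.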
PHP Regular Expressions

• Must use a delimiter: ! @ # /
• Use PHP’s single-quoted strings (no variable interpolation, fewer backslashes to escape)

preg_match                      Match against a pattern and
                                extract text
preg_replace                    Like str_replace() with a pattern
                                (and sub-patterns)
preg_match_all                  Like preg_match, but returns an
                                array of every match and the count
preg_split                      Like explode() but with a
                                pattern
preg_quote                      Escapes text for use in a regular
                                expression
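
A short sketch of the five functions side by side. The sample text and variable names are invented for illustration.

```php
<?php
// Hypothetical sample input.
$text = 'apple, banana;cherry';

// preg_match: first match only; returns 1 on a match, 0 otherwise.
// $m[0] is the whole match, $m[1] and $m[2] are the sub-patterns.
$found = preg_match('/(\w+), (\w+)/', $text, $m);

// preg_match_all: every match; returns the number of matches.
$count = preg_match_all('/\w+/', $text, $all);

// preg_split: like explode(), but the separator is a pattern.
$parts = preg_split('/[,;]\s*/', $text);

// preg_replace: like str_replace(), but the search is a pattern.
$upper = preg_replace('/an/', 'AN', $text);

// preg_quote: escape regex metacharacters in literal text.
// The second argument names the delimiter so it is escaped too.
$quoted = preg_quote('1+1=2?', '/');
```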
Modifiers and Options
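i   PCRE_CASELESS – Ignores case

m   PCRE_MULTILINE – ^ and $ match at the start and end of
    every line, not just of the whole subject

s   PCRE_DOTALL – The dot (.) also matches newlines

U   PCRE_UNGREEDY – Quantifiers are non-greedy by default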
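
The effect of each modifier can be seen by running the same pattern with and without it. This is a sketch with made-up input strings.

```php
<?php
$text = "First line\nsecond LINE";

// i: case-insensitive matching.
$i = preg_match('/line/i', 'LINE');

// m: ^ and $ also match at line boundaries inside the subject.
$noM   = preg_match('/^second/', $text);
$withM = preg_match('/^second/m', $text);

// s: the dot also matches the newline between the two lines.
$noS   = preg_match('/line.second/', $text);
$withS = preg_match('/line.second/s', $text);

// U: quantifiers match as little as possible.
preg_match('/<.+>/', '<a><b>', $greedy);
preg_match('/<.+>/U', '<a><b>', $lazy);
```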
Performance Killers

Slow-downs in performance generally come from:

• Alternation, the pipe/OR operator (|)
  Use [abcd] when possible over (a|b|c|d)
• Dot-all matching across lines (PCRE_DOTALL or /s)
• Backtracking through overlapping quantifiers: (\d+)\d*
  Use explicit lengths such as {n} or {n,m} when possible

It’s not that slow!
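
A small sketch of the two rewrites suggested above, with invented inputs. It only shows that the cheaper forms match the same text; it makes no timing claim.

```php
<?php
$subject = 'xxcxx';

// Alternation and a character class find the same single character.
preg_match('/(a|b|c|d)/', $subject, $alt);
preg_match('/[abcd]/', $subject, $cls);

// An explicit length instead of an open-ended \d+ followed by \d*:
// exactly five digits, nothing for the engine to retry.
$isZip  = preg_match('/^\d{5}$/', '55401');
$notZip = preg_match('/^\d{5}$/', '554011');
```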
Sub-Patterns

Sub-Patterns allow you to extract relevant text from searches:




• For preg_replace, use either \1 or $1 in your replacement string
• Sub-patterns are numbered left to right by their opening parenthesis “(”
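
A sketch of sub-pattern numbering and backreferences, using a made-up date string. Note how the nested group gets its number from the position of its opening parenthesis.

```php
<?php
$date = '2011-03-15';

// Group 1 = year, group 2 = "month-day", group 3 = month, group 4 = day.
preg_match('/(\d{4})-((\d{2})-(\d{2}))/', $date, $m);

// Both backreference styles work in the replacement string.
$us1 = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '$2/$3/$1', $date);
$us2 = preg_replace('/(\d{4})-(\d{2})-(\d{2})/', '\2/\3/\1', $date);
```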
Named Sub-Patterns




(?P<name>pattern)
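
A minimal sketch of named sub-patterns; the group names and the input are hypothetical. Named groups appear in the match array under both their name and their number.

```php
<?php
preg_match(
    '/(?P<year>\d{4})-(?P<month>\d{2})/',
    'Released 2011-03',
    $m
);

// Access by name reads better than $m[1] and $m[2].
$year  = $m['year'];
$month = $m['month'];
```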
Lookaheads
Are zero-width assertions: they don’t move your cursor, they add nothing to the match, and the lookahead itself isn’t counted as a sub-pattern.




                            (?=pattern)
                   Pattern can be any valid regex
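
Two illustrative uses of a lookahead, with made-up inputs. The first extracts a number only when a unit follows it; the second checks a condition before the main pattern runs.

```php
<?php
// Match digits only if "px" follows; "px" itself is not part of the match.
preg_match('/\d+(?=px)/', 'width: 100px', $m);

// Require at least one digit somewhere, then at least 8 characters overall.
$pattern = '/^(?=.*\d).{8,}$/';
$ok  = preg_match($pattern, 'passw0rdX');
$bad = preg_match($pattern, 'passwordX');
```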
Lookbehinds




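   (?<=pattern)  positive      (?<!pattern)  negative
Accepts only limited regex: no unbounded quantifiers such as * or +
(older PCRE versions require each alternative to have a fixed length)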
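
A sketch of both lookbehind forms on an invented string. The negative form also excludes a preceding digit, otherwise the engine would match the tail of a number it had just rejected.

```php
<?php
$text = 'cost $40, qty 12';

// Positive lookbehind: digits that directly follow a dollar sign.
preg_match('/(?<=\$)\d+/', $text, $price);

// Negative lookbehind: digits not preceded by "$" or by another digit.
preg_match_all('/(?<![$\d])\d+/', $text, $plain);
```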
Multi-Line Processing




                     /msU
(^ and $ match per line, dots include newlines, non-greedy)
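
A sketch of the three modifiers working together on a made-up HTML fragment. It uses # as the delimiter so the slash in the closing tag needs no escaping.

```php
<?php
$html = "<p>one\ntwo</p>\n<p>three</p>";

// m: ^ and $ anchor at each line; s: .* may cross the newline inside
// the first paragraph; U: .* stops at the first closing tag it can.
preg_match_all('#^<p>(.*)</p>$#msU', $html, $m);

$paragraphs = $m[1];
```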
Once-Only Sub-Patterns

Once-only (atomic) sub-patterns, written (?>pattern), never give back what they matched, which eliminates slow backtracking from wildcard searching.




       Fewer scans = more speed.
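
A sketch that makes the behaviour visible, with invented inputs. The pattern in the first pair is chosen so that a match is only possible if the group gives back a digit.

```php
<?php
// Normal group: \d+ gives back one digit so the trailing \d can match.
$normal = preg_match('/^(\d+)\d$/', '12345');

// Atomic group: (?>\d+) keeps every digit, so the trailing \d fails.
$atomic = preg_match('/^(?>\d+)\d$/', '12345');

// Typical use: fail fast when the required ":" is missing, instead of
// retrying every shorter run of word characters.
$hit  = preg_match('/^(?>\w+):/', 'key:value');
$miss = preg_match('/^(?>\w+):/', 'no_colon_here');
```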
Greedy

By default, PCRE quantifiers are greedy: they match as much text as possible.




        100,000 runs took 0.2791 seconds
Non-Greedy with Modifier

The /U modifier returns the SMALLEST match.




       100,000 runs took 0.2638 seconds
               (a little better, and it’s right)
Restrictive Wild-Carding

No greedy flag needed, faster without broad wild-cards.




         100,000 runs took 0.2271 seconds
                (fastest yet, no options needed)
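
The slides' own benchmark patterns did not survive, so the following is a sketch of the three approaches on a made-up string. It shows what each one captures and makes no timing claim.

```php
<?php
$html = '<b>one</b> and <b>two</b>';

// Greedy: .* runs to the last closing tag.
preg_match('/<b>(.*)<\/b>/', $html, $greedy);

// Non-greedy via the U modifier: .* stops at the first closing tag.
preg_match('/<b>(.*)<\/b>/U', $html, $lazy);

// Restrictive wildcard: [^<]* cannot run past the first "<".
preg_match('/<b>([^<]*)<\/b>/', $html, $strict);
```

The restrictive version gives the same answer as the non-greedy one without any modifier, because the character class itself rules out the longer match.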
grep

Use grep -E or egrep for extended regular expressions (+, ?, |)
and advanced functionality.

-A n         Print the next n lines after each match.
-B n         Print the previous n lines before each match.
-i           Ignore case
-m n         Stop after n matches
-r           Recursively search the file system
-n           Show line numbers
-v           Only show lines that don’t match
sed

Use -r (-E on OS X / FreeBSD) for extended regular expressions.
The End

  Web: http://andrewkandels.com

  Mail: akandels@gmail.com

Twitter: @andrewkandels