Perl is a high-level scripting language that is useful for text manipulation, file handling, and system administration tasks. It was created by Larry Wall in 1987 and includes powerful regular expression capabilities. Perl scripts can be executed by specifying the path to the Perl interpreter at the top of the file or by invoking Perl at the command line. It supports common programming constructs like variables, conditional statements, and subroutines.
Larry Wall created the Perl programming language in 1987. Perl is a general purpose language that is practical, flexible, and supports both procedural and object-oriented programming. It has a large community of users and third-party modules. Perl can be used for tasks like web development, system administration, and more. It has the philosophies of "there's more than one way to do it" and to "share and enjoy."
2. 2
What is Perl?
• Practical Extraction and Report Language
• A scripting language which is both relatively
simple to learn and yet remarkably powerful.
3. 3
Introduction to Perl
Perl is often described as a cross between shell
programming and the C programming language.
[Diagram: languages labeled by what they handle]
• C: numbers
• Shell programming: text
• Smalltalk: objects
• C++: numbers, objects
• Perl: text, numbers
• Java: objects
4. 4
Introduction to Perl
• A “glue” language. Ideal for connecting things
together, such as a GUI to a number cruncher, or a
database to a web server.
• Has replaced shell programming as the most popular
programming language for text processing and Unix
system administration.
• Runs under all operating systems (including Windows).
• Open source, many libraries available (e.g. database,
internet)
• Extremely popular for CGI and GUI programming.
5. 5
Why use Perl ?
• It is easy to gain a basic understanding of the
language and start writing useful programs
quickly.
• There are a number of shortcuts which make
programming ‘easier’.
• Perl is popular and widely used, especially for
system administration and WWW programming.
6. 6
Why use Perl?
• Perl is free and available on all computing
platforms.
– Unix/Linux, Windows, Macintosh, Palm OS
• There are many freely available additions to Perl
(‘Modules’).
• Most importantly, Perl is designed to
understand and manipulate text.
7. 7
Where to find help!
• http://www.perl.com
• http://www.perl.org
8. 8
Your first Perl script
#!/usr/bin/perl
#This script prints a friendly greeting to the screen
print "Hello World\n";
• Scripts are first “compiled” and then “executed” in
the order in which the lines of code appear
• You can write a script with any text editor. The
only rule is that it must be saved as plain text.
9. 9
Running Perl Scripts
• Perl 5 is installed on our CS system.
• Run from the command line:
palazzi% which perl
/usr/bin/perl
palazzi% perl hello.pl
Hello world!
• You can run the script directly if you make the script
executable, and the first line uses ‘hash-bang’ notation:
palazzi% chmod +x hello.pl
palazzi% hello.pl
#!/usr/bin/perl -w
print "Hello world!\n";
10. 10
Basic Syntax
• The -w option tells Perl to produce extra
warning messages about potential dangers.
Always use this option- there is never (ok, rarely)
a good reason not to.
#!/usr/bin/perl -w
• White space doesn't matter in Perl (like C++),
except for #!/usr/bin/perl -w which must start
from column 1 on line 1.
11. 11
Basic Syntax
• All Perl statements end in a semicolon ; (like C)
• In Perl, comments begin with # (like shell
scripts)
– everything after the # to the end of the line is
ignored.
– # need not be at the beginning of the line.
– there are no C-like multi-line comments: /* */
12. 12
Perl Example
• Back to our “Hello World” program:
palazzi% hello.pl
#!/usr/bin/perl -w
# This is a simple Hello World! Program.
print "Hello world!\n";
–The print command sends the string to the screen,
and "\n" adds a new line.
–You can optionally add parentheses:
print("Hello world!\n");
13. 13
First Script Line by Line
# This script prints a friendly greeting to the
screen
• This is a Perl ‘comment’. Anything you type after a
pound sign (#) is not interpreted by the compiler.
These are notes to yourself or a future reader.
Comments start at the ‘#’ and end at a carriage return
• #!/usr/bin/perl is NOT a comment (note this exception)
14. 14
First Script Line by Line
print "Hello World!\n";
• This is a Perl ‘statement’, or line of code
• ‘print’ is a function - one of many
• "Hello World!\n" is a string of characters
– note the ‘\n’ is read as a single character
meaning ‘newline’
• The semicolon ‘;’ tells the interpreter that
this line of code is complete.
15. 15
Many ways to do it!
1. Welcome to Perl!
2. Welcome to Perl!
3. Welcome to Perl!
4. Welcome to Perl!
5. Welcome to Perl!
6. Welcome
to
Perl!
# welcome.pl
print ( "1. Welcome to Perl!\n" );
print "2. Welcome to Perl!\n" ;
print "3. Welcome ", "to ", "Perl!\n";
print "4. Welcome ";
print "to Perl!\n";
print "5. Welcome to Perl!\n";
print "6. Welcome\n to\n\n Perl!\n";
16. 16
System Calls
• You can use Perl to execute shell
commands, just as if you were typing them
on the command line.
• Syntax:
– `command` # note that ` is the ‘backtick’
character, not the single quote ‘
17. 17
A script which uses a system call
• Note we are now using a ‘variable’ to hold
the results of our system call
#!/usr/bin/perl
$directory_listing = `ls -l .`;
print $directory_listing;
19. 19
What is a variable?
• A named container for a single value
– can be text or number
– sometimes called a ‘scalar’
• A scalar variable has the following rules
– Must start with a dollar sign ($)
– Must not start with a number
– Must not contain any spaces
– May contain letters (‘a’ through ‘z’, ‘A’ through ‘Z’), digits, or
the ‘_’ character
20. 20
Basic Types
• Scalars, Lists and Hashes:
– $cents=123;
– @home=("kitchen", "living room", "bedroom");
– %days=( "Monday"=>"Mon",
"Tuesday"=>"Tues");
• All variable names are case sensitive.
21. 21
Scalars
• Denoted by ‘$’. Examples:
• $cents=2;
• $pi=3.141;
• $chicken="road";
• $name=`whoami`;
• $foo=$bar;
• $msg="My name is $name";
• In most cases, Perl determines the type (numeric
vs. string) on its own, and will convert
automatically, depending on context. (eg, printing
vs. multiplying)
22. 23
Scalar variable names
• These are valid names
– $variable
– $this_is_a_place_for_my_stuff
– $Xvf34_B
• These are invalid names
– $2
– $another place for my stuff
– $push-pull
– $%percent
23. 24
Variable name tips
• Use descriptive names
– $sequence is much more informative than $x
– $sequence1 is ok. $sequence_one is fine too
• Avoid using names that look like functions
– $print is probably bad (it will work!)
• Try to avoid single letter variable names
– $a and $b are used for something else
– Experienced programmers will often use $i and
$j as ‘counters’ for historical reasons.
24. 25
Operators
Operator  Description         Example            Result
.         String concatenate  'Teddy' . 'Bear'   TeddyBear
=         Assignment          $bear = 'Teddy'    $bear variable contains 'Teddy'
+         Addition            3+2                5
-         Subtraction         3-2                1
-         Negation            -2                 -2
!         Not                 !2                 '' (empty string, i.e. false)
*         Multiplication      3*2                6
/         Division            3/2                1.5
%         Modulus             3%2                1
**        Exponentiation      3**2               9
. acts on strings only, ! on both strings and numbers, the rest
on numbers only.
25. 26
A Perl calculator
#!/usr/bin/perl
$value_one = shift; #Takes the first argument from the command line
$value_two = shift; #Takes the next argument from the command line
$sum = $value_one + $value_two;
$difference = $value_one - $value_two;
$product = $value_one * $value_two;
$ratio = $value_one / $value_two;
$power = $value_one ** $value_two;
print "The sum is: $sum\n";
print "The difference is: $difference\n";
print "The product is: $product\n";
print "The ratio is: $ratio\n";
print "The first number raised to the power of the second number is: $power\n";
print ("I could have also written the sum as:", $value_one + $value_two, "\n");
26. 27
Quoting
• When printing, use escapes (backslash) to print special characters:
– print "She said \"Nortel cost \$$cost \@ $time\".";
– Output: She said "Nortel cost $0.01 @ 10:00".
• Special chars: $,@,%,&,"
• Use single quotes to avoid interpolation:
– print 'My email is [email protected]. Please send me $';
– (Now you need to escape single quotes.)
• Another quoting mechanism: qq() and q()
– print qq(She said "Nortel cost \$$cost \@ $time".);
– print q(My email is [email protected]. Please send me $);
– Useful for strings full of quotes.
27. 28
Backquotes: Command Substitution
• You can use command substitution in Perl like in shell
scripts:
$ whoami
bhecker
#!/usr/bin/perl -w
$user = `whoami`;
chomp($user);
$num = `who | wc -l`;
chomp($num);
print "Hi $user! There are $num users logged on.\n";
$ test.pl
Hi bhecker! There are 6 users logged on.
• Command substitution will usually include a new line, so use
chomp().
29. 30
Quotes and more Quotes - Recap
• There is a fine distinction between double quoted
strings and single quoted strings:
– print "$variable\n" # prints the contents of $variable
and then a newline
– print '$variable\n' # prints the string $variable\n to the
screen
• Single quotes treat almost all characters as literal (only \' and \\
are special)
• You can always specify a character to be treated
literally in a double quoted string:
– print "I really want to print a \$ character\n";
30. 31
Even more options
• the qq operator
– print qq[She said "Hi there, $stranger".\n] ; #same as
– print "She said \"Hi there, $stranger\".\n" ;
• qq means change the character used to denote the
string
– Almost any non-letter character can be used, best to pick
one not in your string
• print qq$I can print this string\n$;
• print qq^Or I can print this string\n^;
• print qq &Or this one\n&;
– perl thinks that if you use a ‘(‘, ‘[‘, or ‘{‘ to open the
string, you mean to use a ‘)’, ‘]’, or ‘}’ to close it
31. 32
What is Truth?
• A question debated by man since before
cave art.
• A very defined thing in Perl.
– Something is FALSE if:
• a) it evaluates to zero (0 or the string '0')
• b) it evaluates to '' (empty string)
• c) it evaluates to an empty list (@array = ())
• d) the value is undefined (ie. uninitialized variable)
– Everything else is TRUE
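The rules above can be checked directly. The following is a minimal sketch; the helper name truth and the sample values are illustrative assumptions, not part of the original slides.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how Perl judges a value in boolean context.
sub truth {
    my ($value) = @_;
    return $value ? 'TRUE' : 'FALSE';
}

my @empty = ();

print "0            is ", truth(0),             "\n";   # zero
print "''           is ", truth(''),            "\n";   # empty string
print "'0'          is ", truth('0'),           "\n";   # the string '0'
print "undef        is ", truth(undef),         "\n";   # undefined value
print "empty array  is ", truth(scalar @empty), "\n";   # array in scalar context gives 0
print "'0.0'        is ", truth('0.0'),         "\n";   # a non-empty string other than '0'
print "'hello'      is ", truth('hello'),       "\n";
```

The first five lines report FALSE and the last two report TRUE, which matches the list on this slide.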
32. 33
Numeric Comparison Operators
Operator  Description            Example   Result
==        Equality               2 == 2    TRUE
!=        Non Equality           2 != 2    FALSE
>         Greater Than           3 > 2     TRUE
<         Less Than              3 < 2     FALSE
<=        Less Than or Equal     3 <= 2    FALSE
>=        Greater Than or Equal  3 >= 2    TRUE
<=>       Comparison             3 <=> 2   1
"         "                      2 <=> 3   -1
"         "                      3 <=> 3   0
• Do not confuse ‘=‘ with ‘==‘ !!!!
• <=> is really only useful when using the ‘sort’ function
33. 34
String (Text) Comparison Operators
Operator  Description            Example           Result
eq        Equality               'cat' eq 'cat'    TRUE
ne        Non Equality           'cat' ne 'cat'    FALSE
gt        Greater Than           'data' gt 'cat'   TRUE
lt        Less Than              'data' lt 'cat'   FALSE
ge        Greater Than or Equal  'data' ge 'cat'   TRUE
le        Less Than or Equal     'data' le 'cat'   FALSE
cmp       Comparison             'data' cmp 'cat'  1
"         "                      'cat' cmp 'data'  -1
"         "                      'cat' cmp 'cat'   0
• cmp is really only useful when using the ‘sort’ function
34. 35
What did you mean?
• To make your life ‘easier’, Perl has only one
data type for both strings (characters) and
numbers.
• When you use something in numeric context,
Perl treats it like a number.
– $y = '2.0' + '1'; # $y contains 3
– $y = 'cat' + 1; # $y contains 1
• When you use something in string context,
perl treats it like a string.
– $y = '2.0' . '1'; # $y contains '2.01'
• In short, be careful what you ask for!!
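A short sketch of the same conversions, runnable as a script. The variable names are illustrative; the 'numeric' warning is switched off only around the deliberately odd 'cat' + 1 line, since with warnings enabled Perl would report that "cat" isn't numeric.

```perl
use strict;
use warnings;

my $sum    = '2.0' + '1';    # numeric context: 2.0 + 1
my $joined = '2.0' . '1';    # string context: concatenation

my $word = 'cat';
my $odd;
{
    no warnings 'numeric';   # silence the "isn't numeric" warning for this block
    $odd = $word + 1;        # 'cat' counts as 0 in numeric context
}

print "sum:    $sum\n";      # 3
print "joined: $joined\n";   # 2.01
print "odd:    $odd\n";      # 1
```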
35. 36
More Truth
• Statements can also be TRUE or FALSE, and this
is generally logical
– a) 1 == 2 - false
– b) 1 !=2 - true
– c) 'dog' eq 'cat' - false
– d) (1+56) <= (2 * 100) - true
– e) (1-1) - false! - evaluates to zero
– f) '0.0' - true! Tricky.
– g) '0.0' + 0 - false! Even trickier.
36. 37
Functions
• Functions are little bundles of Perl code
with names. They exist to make it easy to
do routine operations
• Most functions do what you think they do,
to find out how they work type:
– perldoc -f function_name
37. 38
A Perl Idiom - if
• if is a keyword which does something if a condition
is true.
– print "Number is 2" if ($number == 2);
• Of course, there is also a keyword that does the
opposite - unless
– print "Number isn't 2" unless ($number == 2);
• You don’t ever need to use unless, unless you want
to...
– print "Number isn't 2" if ($number != 2);
38. 39
More about if
• A frequent Perl construction is the
if/elsif/else construct
– if (something){ do something }
– elsif (something else) { do something }
– else { do the default thing }
• The block of code associated with the first
true condition is executed.
• Note: elsif, not elseif
39. 40
Traditional usage of if
#!/usr/bin/perl
# bigger.pl
$value = shift;
unless ($value =~ /^\d+$/ ){
print "$value contains a non-digit character. Integers are all digits!\n";
die;
}
if ($value > 100){
print "$value is bigger than 100\n";
}
elsif ($value >= 10){
print "$value is 10 or greater\n";
}
else {
print "$value is smaller than 10\n";
}
40. 41
Control flow
if ($foo==10) {
print "foo is ten\n";
}
print "foo is ten" if ($foo==10);
if ($today eq "Tuesday") {
print "Class at four.\n";
} elsif ($today eq "Friday") {
print "See you at the bar.\n";
} else {
print "What's on TV?\n";
}
41. 42
Control flow
You’ve already seen a while loop.
for loops are just like C:
for ($i=0; $i<10; $i++) {
print "i is $i\n";
}
43. 44
A brief Diversion
• Get into the habit of using the -w flag
– mnemonic (Warn me when weird)
• Enables more strict error checking
– Will warn you when you try to compare strings
numerically, for example.
• Usage
– command line: ‘perl -w script.pl’
• even more diversion: ‘perl -c script.pl’ compiles but
does not run script.pl
– Or line: #!/usr/bin/perl -w
45. 46
Data flow
• Unless you say otherwise:
– Data comes in through STDIN (Standard IN)
– Data goes out through STDOUT (Standard Out)
– Errors go to STDERR (Standard Error)
• Error code contained in a ‘magic’ variable $!
46. 47
User Input
• Use <STDIN> to get input from the user:
#!/usr/bin/perl -w
print "Enter name: ";
$name = <STDIN>;
chomp ($name);
print "How many pens do you have? ";
$number = <STDIN>;
chomp($number);
print "$name has $number pen!\n";
$ test.pl
Enter name: Barbara Hecker
How many pens do you have? one
Barbara Hecker has one pen!
47. 48
User Input
• <STDIN> grabs one line of input, including the new
line character. So, after:
$name = <STDIN>;
if the user typed "Barbara Hecker[ENTER]", $name
will contain: "Barbara Hecker\n".
• To delete the new line, the chomp() function takes a
scalar variable, and removes the trailing new line if
present.
• A shortcut to do both operations in one line is:
chomp($name = <STDIN>);
48. 49
Numerical Example
#!/usr/bin/perl -w
print "Enter height of rectangle: ";
$height = <STDIN>;
print "Enter width of rectangle: ";
$width = <STDIN>;
$area = $height * $width;
print "The area of the rectangle is $area\n";
$ test.pl
Enter height of rectangle: 10
Enter width of rectangle: 5
The area of the rectangle is 50
$ test.pl
Enter height of rectangle: 10.1
Enter width of rectangle: 5.1
The area of the rectangle is 51.51
49. 50
An idiom - while
• while a condition is true, do a block of statements
• If you really want to know... The opposite of while is until
• The most common use of while is for reading and acting on lines of data from a file
50. 51
Usage of while
#while_count.pl
$val = 0;
while ($val < 5){
print "$val\n";
$val++;
}
• while the condition is true ($val is less than
5), do something (print $val)
• ‘++’? Same as in C/C++
52. 53
Reading (and modifying) a file
#line_count.pl
while ($val = <>){
$line++;
print "$line:\t$val"; # $val still ends with its own newline
}
• Perl Magic! <>
– Opens the file (or files) given as arguments on
the command line
– Brings in one line of data at a time
53. 54
Filehandles
• A filehandle is a way to interact with input or
output
– ‘<>’ interacts with files on the command line
• filehandle names are simple strings with no
symbols
– I usually use all caps (SEQFILE), but that isn’t
necessary
• You must open your filehandle before using it
54. 55
Opening Filehandles
• Open a file for reading
– open NAME, "<filename";
• This is default behavior, so you don’t actually need the ‘<‘
• Open file for writing
– open NAME, ">filename"; #open new file
• Warning: If filename already exists, it is overwritten!!
– open NAME, ">>filename"; # append to old file
55. 56
Filehandle
• Flexible coding
– I want to specify the file to open on the
command line, rather than hard coding it
$in_name = shift;
$out_name = shift;
open FILE, "<$in_name" or die "Couldn't open $in_name for reading: $!\n";
open OUT, ">$out_name" or die "Couldn't open $out_name for writing: $!\n";
while ($line = <FILE>){
chomp $line;
print OUT "Something about $line\n";
}
• Usage: <$> myscript.pl inputfile outputfile
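The same read-and-write loop is often written with the three-argument form of open and lexical filehandles. This is a sketch of that alternative style, not the form used on the slide; the temporary files created with File::Temp stand in for the command-line arguments so the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-ins for the two command-line file names (an assumption for this sketch).
my ($tmp_in,  $in_name)  = tempfile(UNLINK => 1);
my ($tmp_out, $out_name) = tempfile(UNLINK => 1);
print {$tmp_in} "first\nsecond\n";
close $tmp_in;
close $tmp_out;

# 'or' binds more loosely than '||', so die runs whenever open itself fails.
open(my $in,  '<', $in_name)  or die "Couldn't open $in_name for reading: $!\n";
open(my $out, '>', $out_name) or die "Couldn't open $out_name for writing: $!\n";

while (my $line = <$in>) {
    chomp $line;
    print {$out} "Something about $line\n";
}

close $in;
close $out or die "Couldn't close $out_name: $!\n";
```

Keeping the mode ('<' or '>') separate from the file name means a name that happens to start with '>' cannot change how the file is opened.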
56. 57
When do I use a filehandle?
• You can get away with not using them, mostly.
– STDIN is fine (<>) and you can always capture your
STDOUT to a file with a redirect (>) on the command line.
– <$> myscript.pl file_in > file_out
• If you are using two input files for different purposes
or want more than one output file, you need
filehandles
– <> will slurp all the input files on command line!
– > on the command line will put all output to one file
57. 58
Perl as Duct Tape (the force that
glues the universe together)
• The STDOUT of one script can serve as the
STDIN of another script.
– use the pipe (‘|’) symbol to chain scripts together
• Nothing goes to the screen in between scripts
– instead, what would normally go to the screen is
redirected and made the STDIN of the next script
59. 60
A brief diversion
• strict – forces you to ‘declare’ a variable the first time
you use it.
– usage: use strict; (somewhere near the top of your script)
• declare variables with ‘my’
– usage: my $variable;
– or: my $variable = 'value';
• my sets the ‘scope’ of the variable. Variable exists
only within the current block of code
• use strict and my both help you to debug errors, and
help prevent mistakes.
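A minimal sketch of what strict and my do together. The variable names are illustrative.

```perl
use strict;
use warnings;

my $sequence = 'ATGC';              # visible for the rest of the file
my $length;

{
    my $inner = length $sequence;   # $inner exists only inside this block
    $length = $inner;
}

print "Sequence $sequence has $length bases\n";

# Uncommenting the next line stops compilation under 'use strict',
# because $inner was never declared in this scope:
# print $inner;
```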
60. 61
What is an array?
• A named container for a list of values
– can be text or number, or mix
– An array is an ordered list.
• Array names follow the same rules as scalar
variables
– No spaces
– a-Z 0-9 and ‘_’ only
– Cannot start with a number
61. 62
Making an array
• @my_array = (1, 15, 'cat', 23, 'blue');
– Note this is a comma separated list, enclosed in
parentheses. The parentheses are very important!!
• A tricky way:
– @my_array = qw(1 15 cat 23 blue);
• mnemonic: qw - ‘Quote Words’
• Remember no commas if you use qw!
62. 63
A picture might help
• @my_array = (1, 15, 'cat', 23, 'blue');
• @my_array
Element #  Contents
0          1
1          15
2          'cat'
3          23
4          'blue'
63. 64
Getting at the Array Elements
• @my_array = (5, 'boo', '16', 'hoo');
• $my_array[1] contains 'boo'
– Pay attention! The way this is written is important
• An array element is a single (scalar) value
• Starts with the $ sign (just like a scalar) not the @ sign
• Square brackets indicate the array position (index, or
element number)
• Perl counts from zero!! First element is $my_array[0]
64. 65
Manipulating Array Elements
• You can do anything to an array element
that you can do to a scalar.
– $my_array[2] = 'scary';
• Of course you can do an assignment (=)
• list now is (5, 'boo', 'scary', 'hoo')
– $string = $my_array[2].$my_array[1];
• $string contains 'scaryboo'
– $my_array[5] = '16';
• list now (5, 'boo', 'scary', 'hoo', undef, '16')
• your list is as long as it needs to be!
65. 66
A Common Mistake
• @array is not the same as $array
– One is an array, one is a scalar.
– To get at an array element, must use square braces.
($array[$i])
– The square braces are how Perl knows you are talking
about an array
– You may have both @array and $array at the same
time. They are completely different, and not related
in any way at all.
• Since they are different, use different names and
don’t confuse yourself.
66. 67
Some useful tricks
• copy an array
– @array_copy = @array;
• join two arrays
– @array_join = (@array1,@array2);
• reverse the order of an array
– @array_flip = reverse(@array);
• print an array (simple method)
– print @array; # prints elements with no spaces
– print "@array"; # prints elements separated by single
space
67. 68
Some more useful tricks
• Getting at the last element
– $last_element = $my_array[-1];
• negative indices count backwards
• Counting the number of elements
– $count = scalar @array;
• If we use a list in a scalar context, we get the
number of elements in the list. Same as:
– $count = @array;
• In other words, if we try to use an array (list) in the
same way as a single (scalar) variable, perl makes
our array into a number.
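The tricks from the last two slides, collected into one runnable sketch. The array contents are made up for illustration.

```perl
use strict;
use warnings;

my @array1 = (5, 'boo', 16);
my @array2 = ('hoo', 'blue');

my @array_copy = @array1;               # independent copy
my @array_join = (@array1, @array2);    # one flat list of five elements
my @array_flip = reverse @array_join;   # reversed order
my $last       = $array_join[-1];       # negative index counts from the end
my $count      = @array_join;           # array in scalar context: element count

$array_copy[0] = 'changed';             # does not affect @array1

print "@array_flip\n";                  # blue hoo 16 boo 5
print "last: $last, count: $count, original first: $array1[0]\n";
```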
68. 69
List or Scalar Context
• Some functions behave differently if given
a list than if given a scalar.
• An example:
– @array2 = reverse @array1;
• now @array2 contains the elements in @array1 in
reversed order - we’ve seen this already
• list context - the result is assigned to an array, so
reverse returns the elements in reversed order
– $reversedword = reverse $word;
• if $word contained 'Hello', $reversedword contains
'olleH'
• scalar context - the result is assigned to a scalar, so
reverse returns the characters in reversed order
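What decides the behaviour of reverse is the context it is called in, which is usually set by the left-hand side of the assignment. A small sketch with illustrative names:

```perl
use strict;
use warnings;

my $word = 'Hello';

my $reversed_word = reverse $word;             # scalar context: characters reversed
my @reversed_list = reverse('a', 'b', 'c');    # list context: elements reversed
my ($unchanged)   = reverse $word;             # list context: a one-element list, reversed

print "$reversed_word\n";                      # olleH
print "@reversed_list\n";                      # c b a
print "$unchanged\n";                          # Hello

# print supplies list context, so 'scalar' is needed to reverse the characters here.
print scalar reverse($word), "\n";             # olleH
```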
69. 70
Visiting each item in a list
• foreach element (list){do something interesting}
#!/usr/bin/perl -w
use strict;
my @list = ('pkc','pkd', 'mapk32', 'efgr');
my $count = 1;
my $item;
foreach $item (@list){
print "Element number $count is $item\n";
$count++
}
70. 71
Some Tricky Bits (a magic variable)
• The default scalar variable - $_
• In a looping structure (foreach and while, for
example), if you don’t specify a loop variable, the
value will be assigned to $_ instead.
• In general, any function which acts on a scalar
(chomp and print, for example) will act on $_
unless told otherwise.
• It is easier to show it than to describe it...
71. 72
Visiting each item in a list –
magic $_ version
#!/usr/bin/perl -w
use strict;
my @list = ('pkc','pkd', 'mapk32', 'efgr');
my $count = 1;
foreach (@list){
print "Element number $count is $_\n";
$count++
}
72. 73
Making an array from a file
#!/usr/bin/perl -w
use strict;
my @array;
while (my $line = <>){
chomp $line;
@array = (@array,$line);
# push (@array, $line); # a way we don’t know yet
}
# now do something cute with @array
•Assuming each line of your file is to be a single element in
your array...
74. 75
pop and push
• Sometimes, you want to do something with the
end of a list.
– pop : removes the last element from a list
– $last_value = pop @array #or pop (@array)
– push : adds an element to the end of a list
– push @array, $value # or push (@array,'value')
• Both push and pop change the array.
• Remember, push onto the end, pop off the end.
75. 76
shift and unshift
• Sometimes, you want to do something to the front
of a list
– shift : takes the first element off of the list
– $value = shift @array # or $value = shift(@array)
– unshift : puts an element at the front of the list
– unshift @array, $value # or unshift (@array,$value)
• shift and unshift also change the array
• Remember: shift off of the front, unshift onto the
front
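A short sketch showing all four functions acting on one array. The element values are borrowed from the foreach example and are only illustrative.

```perl
use strict;
use warnings;

my @list = ('pkc', 'pkd');

push    @list, 'mapk32';     # add to the end:    pkc pkd mapk32
unshift @list, 'efgr';       # add to the front:  efgr pkc pkd mapk32

my $last  = pop   @list;     # remove from the end:   mapk32
my $first = shift @list;     # remove from the front: efgr

print "removed $first and $last, left with: @list\n";
```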
76. 77
Haven’t I seen shift before?
• You may recall that we used shift to get arguments
into our script in the second class:
– my $value1 = shift; #get command line argument
• This is another example of perl using a default
variable.
• Since we didn’t specify an array, it assumed we
meant @ARGV (the invocation argument array)
– same as typing : my $value = shift @ARGV;
77. 78
Split!
• split is a very useful function
– Takes a string and splits it into an array
– You choose what character (or characters) to
split on
• split (/pattern/, string)
– where pattern is what to split on and string is
what to split
– the split function returns a list
78. 79
Using Split
• my @array = split (/\s/, $string); or
my @array = split ('\s', $string); or
my @array = split '\s'; or
my @array = split; # the last two forms split $_
• Examples:
– split (/\s/, 'a few words');
• returns a list containing ('a', 'few', 'words')
– split (/x/, 'ABxCXxDDxxEFGx');
• returns ('AB', 'CX', 'DD', '', 'EFG')
• Note that the character you split on is ‘destroyed’ - it doesn’t
appear in your list
79. 80
Join: The anti-split
• join : takes an array as its argument, and returns a
string.
• join (glue, list);
• example: $string = join ('glue', @array);
– if array contained ('foo', 15, 'bar')...
– $string = 'fooglue15gluebar'
• Whatever the ‘glue’ is will be the string in between
the array elements.
– You can (and often want to) use '' as the glue
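split and join are often used as a pair, for example to change the separator in a line of data. A minimal sketch with made-up input:

```perl
use strict;
use warnings;

my $line   = "gene1\t42\t3.5";
my @fields = split /\t/, $line;            # ('gene1', '42', '3.5')
my $csv    = join ',', @fields;            # tab-separated becomes comma-separated

# A single-space string is a special case: split on runs of whitespace
# and ignore leading whitespace.
my @words = split ' ', '  a few   words ';

# Empty fields in the middle are kept, trailing empty fields are dropped.
my @parts = split /x/, 'ABxCXxDDxxEFGx';

print "$csv\n";                            # gene1,42,3.5
print join('|', @words), "\n";             # a|few|words
print join('|', @parts), "\n";             # AB|CX|DD||EFG
```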
80. 81
Example: Removing embedded
new lines from a file
#!/usr/bin/perl -w
use strict;
$/ = ">"; #change the ‘record separator’ from \n to the ‘>’ character
<>; # get the first record (just a ‘>’). No assignment, so it disappears!
while (my $record = <>){
chomp $record;
my ($name,@seqs) = split ("\n", $record);
my $sequence = join ('', @seqs);
print ">$name\n$sequence\n";
}
81. 82
Sorting an Array
• You frequently wish to sort a list.
• Two kinds of sorting:
– Alphabetical (the default in perl)
– Numeric
• sort always takes a list as its argument, and
returns a list
– @sorted = sort(@array)
• The argument to sort can be something that
returns a list. So, you could do:
– @sort_split = sort (split ("\t",$line));
82. 83
Sorting an Array (continued)
• Default sort is actually:
– @sorted = sort {$a cmp $b} @list;
• If ‘cmp’ looks familiar, it should. Remember:
– ‘cmp’ : string comparison operator
– ‘<=>’ : numeric comparison operator
• Both return 1, 0, or -1
• It logically follows that if we want to sort a list
numerically:
– @sorted_num = sort {$a <=> $b} @list;
83. 84
More sorting
• $a and $b cannot be renamed. sort is funny that
way. Learn the magic incantation!
• How might you sort in reverse order?
– @sort_reverse = sort {$b cmp $a}@list;
– swapping the order of $a and $b changes the sort order
• You can make the sort block as complicated as you
want.
– @sort_abs = sort { abs($a) <=> abs($b) }@num;
– this sorts on the absolute value of a list of numbers
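The different sort blocks can be compared side by side on one list. A sketch with made-up numbers:

```perl
use strict;
use warnings;

my @numbers = (10, 9, -30, 2, 100);

my @as_text    = sort @numbers;                          # default: compares as strings
my @numeric    = sort { $a <=> $b } @numbers;            # ascending numeric
my @descending = sort { $b <=> $a } @numbers;            # descending numeric
my @by_abs     = sort { abs($a) <=> abs($b) } @numbers;  # by absolute value

print "text:       @as_text\n";      # -30 10 100 2 9
print "numeric:    @numeric\n";      # -30 2 9 10 100
print "descending: @descending\n";   # 100 10 9 2 -30
print "by abs:     @by_abs\n";       # 2 9 10 -30 100
```

The first line shows why the default sort is a poor fit for numbers: as strings, '100' comes before '2'.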
85. 86
What is a regular expression?
• A regular expression (regex) is simply a way
of describing text.
• Regular expressions are built up of small
units which can represent the type and
number of characters in the text
• Regular expressions can be very broad
(describing everything), or very narrow
(describing only one pattern).
86. 87
Why would you use a regex?
• Often you wish to test a string for the
presence of a specific character, word, or
phrase
– Examples
• “Are there any letter characters in my string?”
• “Is this a valid accession number?”
87. 88
Constructing a Regex
• Pattern starts and ends with a /, as in /pattern/
– if you want to match a /, you need to escape it
• \/ (backslash, forward slash)
– you can change the delimiter to some other character, but you
probably won’t need to
• m|pattern|
• any ‘modifiers’ to the pattern go after the last /
• i : case insensitive /[a-z]/i
• o : compile once
• g : match in list context (global)
• m or s : match over multiple lines
88. 89
Looking for a pattern
• By default, a regular expression is applied to $_ (the default
variable)
– if (/a+/) {die}
• looks for one or more ‘a’ in $_
• If you want to look for the pattern in any other variable, you must
use the bind operator
– if ($value =~ /a+/) {die}
• looks for one or more ‘a’ in $value
• The bind operator is in no way similar to the ‘=‘ sign!! = is
assignment, =~ is bind.
– if ($value = /a+/) {die}
• Looks for one or more ‘a’ in $_, not $value!!! (and overwrites $value with the result)
89. 90
Regular Expression Atoms
• An ‘atom’ is the smallest unit of a regular
expression.
• Character atoms
• 0-9, a-Z match themselves
• . (dot) matches any single character except a newline
• [atgcATGC] : A character class (group)
• [a-z] : another character class, a through z
90. 91
More atoms
• \d - Any Digit
• \D - Any non-Digit
• \s - Any Whitespace (space, \t, \n)
• \S - Any non-Whitespace
• \w - Any Word character [a-zA-Z_0-9]
• \W - Any non-Word character
91. 92
An example
• if your pattern is /\d\d\d-\d\d\d\d/
– You could match
• 555-1212
• 5512-12222
• 555-5155-55
– But not:
• 55-1212
• 555-121
• 555j-5555
92. 93
Quantifiers
• You can specify the number of times you want to
see an atom. Examples
• \d* : Zero or more times
• \d+ : One or more times
• \d{3} : Exactly three times
• \d{4,7} : At least four, and not more than seven
• \d{3,} : Three or more times
• We could rewrite /\d\d\d-\d\d\d\d/ as:
– /\d{3}-\d{4}/
93. 94
Anchors
• Anchors force a pattern match to a certain
location
• ^ : match only at the beginning of the string
• $ : match only at the end of the string
• \b : match at word boundary (between \w and \W)
• Example:
• /^\d\d\d-\d\d\d\d$/ : matches only valid phone
numbers
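The effect of the anchors is easiest to see by running the same candidates through the pattern with and without them. A sketch; the hash names and the candidate strings are illustrative.

```perl
use strict;
use warnings;

my @candidates = ('555-1212', '5512-12222', '55-1212', '555j-5555');
my (%loose, %anchored);

for my $text (@candidates) {
    # Without anchors the pattern may match anywhere inside the string.
    $loose{$text}    = ($text =~ /\d{3}-\d{4}/)   ? 'yes' : 'no';
    # With ^ and $ the whole string has to be exactly ddd-dddd.
    $anchored{$text} = ($text =~ /^\d{3}-\d{4}$/) ? 'yes' : 'no';
    print "$text: loose=$loose{$text} anchored=$anchored{$text}\n";
}
```

Only '555-1212' passes the anchored test; '5512-12222' passes the loose test because it contains '512-1222'.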
94. 95
Grouping
• You can group atoms together with
parentheses
• /cat+/ matches cat, catt, cattt
• /(cat)+/ matches cat, catcat, catcatcat
• Use as many sets of parentheses as you need
95. 96
Alternation
• You can specify patterns which match either
one thing or another.
– /cat|dog/ matches either ‘cat’ or ‘dog’
– /ca(t|d)og/ matches either ‘catog’ or ‘cadog’
96. 97
Precedence
• Just like with mathematical operations,
regular expressions have an order of
precedence
– Highest : Parentheses and grouping
– Next : Repetition (+,*, {4})
– Next : Sequence (/abc/)
– Lowest : Alternation ( | )
97. 98
Examples of precedence
• If we represent sequence with a ‘·’
– in other words : /abc/ becomes /a·b·c/
• /a·b*·c/ matches abc, abbc, ac, etc.
• /a·b·c*/ matches ab, abcc, abccc, etc.
• /(a·b·c)+/ matches abc, abcabc, etc.
• /c·a·t|d·o·g/ matches cat or dog
• /(c·a·t)|(d·o·g)/ matches cat or dog
• /c·a·(t|d)·o·g/ matches catog or cadog
98. 99
Variable interpolation
• You can put variables into your pattern.
– if $string = 'cat'
• /$string/ matches ‘cat’
• /$string+/ matches ‘cat’, ‘catt’, etc. (the + applies only to the final ‘t’;
use /(?:$string)+/ to match ‘cat’, ‘catcat’, etc.)
• /\d{2}$string+/ matches ‘12cat’, ‘24catt’, etc.
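Because the variable is interpolated as text before the pattern is compiled, a quantifier placed after it follows the precedence rules from the earlier slide. A minimal sketch, anchored with ^ and $ so that each test covers the whole string; the variable names are illustrative.

```perl
use strict;
use warnings;

my $string = 'cat';

# After interpolation this pattern is /^cat+$/, so + repeats only the 't'.
my $catt_plain     = ('catt'   =~ /^$string+$/)     ? 1 : 0;
my $catcat_plain   = ('catcat' =~ /^$string+$/)     ? 1 : 0;

# Grouping the variable makes + repeat the whole word.
my $catcat_grouped = ('catcat' =~ /^(?:$string)+$/) ? 1 : 0;

print "catt with /\$string+/:         $catt_plain\n";       # 1
print "catcat with /\$string+/:       $catcat_plain\n";     # 0
print "catcat with /(?:\$string)+/:   $catcat_grouped\n";   # 1
```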
99. 100
Remembering Stuff
• Being able to match patterns is good, but
limited.
• We want to be able to keep portions of the
regular expression for later.
– Example: $string = ‘phone: 353-7236’
• We want to keep the phone number only
• Just figuring out that the string contains a phone
number is insufficient, we need to keep the number
as well.
100. 101
Memory Parentheses (pattern memory)
• Since we almost always want to keep
portions of the string we have matched,
there is a mechanism built into perl.
• Anything in parentheses within the regular
expression is kept in memory.
– 'phone:353-7236' =~ /^phone:(.+)$/;
• Perl knows we want to keep everything that matches
‘.+’ in the above pattern
101. 102
Getting at pattern memory
• Perl stores the matches in a series of default variables.
The first parentheses set goes into $1, second into $2,
etc.
– This is why we can’t name variables ${digit}
– Memory variables are created only in the amounts needed.
If you have three sets of parentheses, you have ($1,$2,$3).
– Memory variables are created for each matched set of
parentheses. If you have one set contained within another
set, you get two variables (numbered by the position of the
opening parenthesis, so the outer set gets the lower number)
– Memory variables are only valid in the current scope
102. 103
An example of pattern memory
my $string = shift;
if ($string =~ /^phone:(\d{3}-\d{4})$/){
$phone_number = $1;
}
else {
print "Enter a phone number!\n"
}
103. 104
Some tricky bits
• You can assign pattern memory directly to
your own variable names:
– ($phone) = $value =~ /^phone:(.+)$/;
• Read from right to left. Bind (apply) this pattern to
the value in $value, and assign the results to the list
on the left
– ($front,$back) = /^phone:(\d{3})-(\d{4})/;
• Bind this pattern to $_ (!!!) and assign the results to
the list on the left
104. 105
List or scalar context?
• A pattern match returns 1 or the empty string (true or false) in a
scalar context, and a list of matches in list
context.
• There are a lot of functions that do different things
depending on whether they are used in scalar or
list context.
• $count = @array # returns the number of elements
• $revString = reverse $string # returns a reversed string
• @revArray = reverse @array # returns a reversed list
105. 106
Practical Example of Context
• $phone = $string =~ /^.+:(.+)$/;
– $phone contains 1 if pattern matches, the empty string if not
– scalar context!!!
– This is why this worked!
unless (/^\d+$/){
die}
• ($phone) = $string =~ /^.+:(.+)$/;
– $phone contains the matched string
– list context!!!
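The difference that the parentheses on the left make can be seen by running both forms on the same string. A sketch using the phone string from the earlier slides; the variable names are illustrative.

```perl
use strict;
use warnings;

my $string = 'phone:353-7236';

my $matched = $string =~ /^.+:(.+)$/;    # scalar context: true/false result
my ($phone) = $string =~ /^.+:(.+)$/;    # list context: the captured text

# Several sets of parentheses fill several variables at once.
my ($front, $back) = $string =~ /^phone:(\d{3})-(\d{4})$/;

print "matched: $matched\n";             # 1
print "phone:   $phone\n";               # 353-7236
print "front:   $front, back: $back\n";  # 353 and 7236
```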
106. 107
Finding all instances of a match
• Use the ‘g’ modifier to the regular expression
– @sites = $sequence =~ /(TATTA)/g;
– think g for global
– Returns a list of all the matches (in order), and
stores them in the array
– If you have more than one pair of parentheses,
your array gets values in sets
• ($1,$2,$3,$1,$2,$3...)
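A small sketch of the ‘g’ modifier in list context. The sequence and the key=value string are made-up inputs.

```perl
use strict;
use warnings;

my $sequence = 'GGTATTACCTATTAGG';

my @sites      = $sequence =~ /(TATTA)/g;   # every match, in order
my $site_count = @sites;

# With two sets of parentheses the values arrive in sets: ($1, $2, $1, $2, ...)
my @pairs = 'a=1,b=2' =~ /(\w)=(\d)/g;

print "found $site_count sites: @sites\n";  # found 2 sites: TATTA TATTA
print "pairs: @pairs\n";                    # pairs: a 1 b 2
```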
107. 108
Perl is Greedy
• In addition to taking all your time, perl regular
expressions also try to match the largest possible string
which fits your pattern
– /ga+t/ matches gat, gaat, gaaat
– 'Doh! No doughnuts left!' =~ /(d.+t)/
• $1 contains 'doughnuts left'
• If this is not what you wanted to do, use the ‘?’ modifier
– /(d.+?t)/ # match as few ‘.’s as you can and still make
the pattern work
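Greedy and non-greedy matching compared on the doughnut string from this slide. A minimal sketch; the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = 'Doh! No doughnuts left!';

my ($greedy) = $text =~ /(d.+t)/;     # .+ grabs as much as it can
my ($lazy)   = $text =~ /(d.+?t)/;    # .+? stops at the first 't' that works

print "greedy: $greedy\n";            # doughnuts left
print "lazy:   $lazy\n";              # doughnut
```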
108. 109
Making parentheses forgetful
• Sometimes you need parentheses to make your regex
work, but you don’t actually want to keep the results.
You can still use parentheses for grouping.
• /(?:group)/
– yet another instance of character reuse.
• \d? means 0 or 1 instances
• \d+? means the fewest non zero number of digits (don’t be
greedy)
• (?:group) means look for the group of atoms in the string,
but don’t remember it.
109. 110
Substitute function
• s/pattern1/pattern2/;
• Looks kind of like a regular expression
– Patterns constructed the same way
• Inherited from previous languages, so it can
be a bit different.
– Changes the variable it is bound to!
110. 111
Using s
• Substituting one word for another
– $string =~ s/dogs/cats/;
• If $string was "I love dogs", it is now "I love cats"
• Removing trailing white space
– $string =~ s/\s+$//;
• If $string was 'ATG ', it is now 'ATG'
• Adding 10 to every number in a string
– $string =~ s/(\d+)/$1+10/ge;
• If string was "I bought 5 dogs at 2 bucks each", it is now:
– "I bought 15 dogs at 12 bucks each"
• Note pattern memory!!
• g means global (just like a regex)
• e is special to s, evaluate the expression on the right
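The three substitutions above can be run together as a quick check. A minimal sketch using the slide's own strings; the variable names are chosen here for illustration.

```perl
use strict;
use warnings;

my $pets = 'I love dogs';
$pets =~ s/dogs/cats/;            # substitute one word for another

my $codon = 'ATG   ';
$codon =~ s/\s+$//;               # strip trailing white space

my $bill = 'I bought 5 dogs at 2 bucks each';
$bill =~ s/(\d+)/$1+10/ge;        # e evaluates $1+10, g repeats it

print "$pets\n";      # I love cats
print "[$codon]\n";   # [ATG]
print "$bill\n";      # I bought 15 dogs at 12 bucks each
```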
tr function
• translate or transliterate
• tr/characterlist1/characterlist2/;
• Even less like a regular expression than s
• substitutes characters in the first list with
characters in the second list
$string =~ tr/a/A/; # changes every ‘a’ to an ‘A’
– No need for the g modifier when using tr.
Using tr
• Creating complementary DNA sequence
– $sequence =~ tr/atgc/TACG/;
• Sneaky Perl trick for the day
– tr does two things.
• 1. changes characters in the bound variable
• 2. Counts the number of times it does this
– Super-fast character counter™
• $a_count = $sequence =~ tr/a/a/;
• replaces an ‘a’ with an ‘a’ (no net change), and assigns the
result (number of substitutions) to $a_count
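Both uses of tr can be sketched on one made-up sequence. The copy-then-translate idiom in the second statement is an addition here, used so that the original $sequence is left unchanged.

```perl
use strict;
use warnings;

my $sequence = 'atggcatta';   # made-up sequence

# tr returns the number of characters it translated, so this counts 'a'
my $a_count = ($sequence =~ tr/a/a/);

# Copy the sequence first, then translate the copy into its complement
(my $complement = $sequence) =~ tr/atgc/TACG/;

print "$a_count\n";      # 3
print "$complement\n";   # TACCGTAAT
```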
What is a Module?
• A module is basically a collection of subroutines (and
sometimes variables) that increases the abilities of Perl
• Often, modules are put together by other people, and
distributed for public use
• Two types of modules:
– Standard (built in): Modules which are so useful (or
popular) that they are included with the standard
distributions of Perl
– Custom installed: Modules which are added to a
distribution of Perl by an end user
Using a module (example)
• The File::Basename module (imports functions)
#!/usr/bin/perl
use strict;
use File::Basename;
my $path = '/disk2/gcg/users/seqs.fsa';
my $file = basename($path);
my $dir = dirname($path);
print "The file name is $file in the directory $dir\n";
Using another Module
• The Env module (imports variables)
#!/usr/bin/perl -w
use strict;
use Env;
print "My home is $HOME\n";
print "My path is $PATH\n";
print "My username is $USER\n";
Using A Module
• Modules are as different as the people who write
them.
• A good module will have good documentation, with
examples
• perldoc ModuleName will get you the
documentation
• You may see object oriented syntax with arrows
– $record = ModuleName->new($param);
Where do I get modules?
• Many modules are already installed with your
distribution of Perl
• If you are in doubt, try to look at the
documentation. If a module is installed, you will
be able to read the docs.
• All public modules are available through
CPAN (Comprehensive Perl Archive Network)
www.CPAN.org
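Besides reading the docs with perldoc, a script can test for a module by trying to load it. The sketch below is one way to do that; is_installed is a hypothetical helper written for this example, not part of Perl.

```perl
use strict;
use warnings;

# is_installed is a hypothetical helper: true if the named module loads
sub is_installed {
    my ($module) = @_;
    # require dies if the module is missing; eval catches that
    my $ok = eval "require $module; 1";
    return $ok ? 1 : 0;
}

for my $module ('File::Basename', 'LWP::Simple') {
    printf "%-15s %s\n", $module,
        is_installed($module) ? 'installed' : 'not found';
}
```

File::Basename ships with Perl, so it should always be reported as installed; the result for LWP::Simple depends on the local installation.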
Getting data from the web
• Problem: Everybody posts data on the web, nobody
knows how to get it off easily.
• Problem: Cutting and pasting from web pages is
unsatisfying, and hard on the hands and wrists
• Problem: You want the most up to date
information from a web resource
• Answer: Create a Perl script which acts as your
agent on the web (a ‘Robot’)
Before you become a Robot...
• As with all power, this power can be used for good, or for evil
• If you plan on getting a lot of data, consider the possibility
that there may be another (easier to use) source of the data
• It is considered rude to request very large amounts of data, or
to request at a frequency which denies the resource to other
users
• This technology can be used to mount DoS (denial of service)
attacks. Don’t do this, even by accident
• The website administrator may, without your permission, cut
you off in self defense. Or cut off your entire university.
Don’t be the idiot who ruins it for everybody.
Baby Steps: Beginning Robotics
• Unfortunately, you need to know a little about
how HTML is written and deciphered. This is
learned through practice and by looking at
examples
• Almost everything you will want to do in a
scripting language can be accomplished by
using a simple Perl module.
• There are more powerful (and potentially
deceptive) things that can be done with all sorts
of Perl modules.
The ‘Static’ URL Request
• Some resources are ‘static’ pages, which present
the same data on each request
(http://www.csuhayward.edu).
• Each web page has an address (URL – Uniform
Resource Locator), which uniquely identifies it on
the internet
• Static pages are easy to collect data from, since
they don’t change from request to request
Constructing the Robot
• Now that we know the URL, we can mimic human
interaction with the web resource using Perl
• We do four relatively simple things
– 1. Construct a text string which looks like a valid
request
– 2. Use LWP::Simple to submit this text string as a web
request
– 3. Retrieve the web page as a single text string (record)
– 4. Get the information we desire out of the record.
Using Modules
• Some handy modules:
– FileHandle (more intuitive filehandle library)
– LWP::Simple (simple web ops – page fetching,
etc).
– XML::RSS (an RSS/RDF parser).
– Date::Tolkien::Shire (do date manipulation in the
Shire calendar.)
– Thousands more..
What is LWP::Simple?
• LWP::Simple is a simplified, procedural interface to
LWP (libwww-perl), a set of Perl modules which
provides a simple and consistent application
programming interface (API) to the World-Wide Web.
The main focus of the library is to provide classes and
functions that allow you to write WWW clients.
• The library also contains modules that are of
more general use and even classes that help
you implement simple HTTP servers.
Constructing the Robot (example)
#!/usr/bin/perl -w
use strict;
use LWP::Simple; # tell Perl we want LWP::Simple functions
# Create a string which looks like a valid URL
my $URL_string = 'http://www.csuhayward.edu/';
# Use the LWP::Simple 'get' function to request the page
my $results = get($URL_string);
# get returns undef if the request fails
die "Could not fetch $URL_string\n" unless defined $results;
print $results;
The ‘dynamic’ URL Request
• Some online resources present different content, based on
user input. They are ‘dynamic’, in the sense that they
change their output based on a response to user input.
• Most of these online resources interact with the end user
through CGI (Common Gateway Interface) scripts, which
are often written in Perl.
• Regardless of the scripting language, CGI scripts get user
input through parameters, and these parameters are passed
through the URL request.
• You have to know what this request looks like, in order to
properly pose as a human user.
The Request (Decoded)
• Often, you can see what your request looks like
right in your browser.
• http://www.ncbi.gov/UniGene/clust.cgi?ORG=Mm&CID=7
• Everything up to the ‘?’ character is the URL
• In this case, ‘clust.cgi’ is the name of the script
which processes the web request
• Everything after the ‘?’ is the list of parameters passed to
the script, written as name=value pairs separated by ‘&’
– Parameter ‘ORG’ = Mm
– Parameter ‘CID’ = 7
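The decoding can also be done in Perl. This is a minimal sketch using only split; it assumes plain name=value pairs and does not undo URL escaping such as %20.

```perl
use strict;
use warnings;

my $request = 'http://www.ncbi.gov/UniGene/clust.cgi?ORG=Mm&CID=7';

# Split once on the first '?': script URL on the left, parameters on the right
my ($url, $query) = split /\?/, $request, 2;

# Parameters are separated by '&', and each one is a name=value pair
my %param = map {
    my ($name, $value) = split /=/, $_, 2;
    ($name => $value);
} split /&/, $query;

print "URL: $url\n";                              # URL: http://www.ncbi.gov/UniGene/clust.cgi
print "$_ = $param{$_}\n" for sort keys %param;   # CID = 7, then ORG = Mm
```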
Constructing the ‘Dynamic’ Robot
• Now that we know the URL and the parameters it
is expecting, we can mimic human interaction with
the web resource using Perl
• We do the same four relatively simple things
– 1. Construct a text string which looks like a valid
request
– 2. Use LWP::Simple to submit this text string as a web
request
– 3. Retrieve the web page as a single text string (record)
– 4. Get the information we desire out of the record.
Phase 1: Construct the request string
• 1. Decide which parameters are going to change,
and make them into variables.
my $URL_front =
'http://www.ncbi.nlm.nih.gov/UniGene/clust.cgi?ORG=Mm&CID=';
my $cluster = shift;
chomp $cluster;
my $request = $URL_front.$cluster;
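If several clusters are to be requested, the same construction can be wrapped in a subroutine. In this sketch build_request is a hypothetical helper, and the check that the cluster ID is a whole number is an assumption about what the script expects.

```perl
use strict;
use warnings;

my $URL_front = 'http://www.ncbi.nlm.nih.gov/UniGene/clust.cgi?ORG=Mm&CID=';

# build_request is a hypothetical helper: it appends a numeric cluster ID
sub build_request {
    my ($cluster) = @_;
    die "Cluster ID must be a whole number\n" unless $cluster =~ /^\d+$/;
    return $URL_front . $cluster;
}

print build_request($_), "\n" for 7, 42;
```

Checking the input before building the string keeps a stray newline or typo from turning into a malformed request.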
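Phase 2 and 3: Make the request and save the results
use LWP::Simple;
# LWP::Simple comes with the libwww-perl distribution from CPAN;
# it is not part of core Perl, but many installations include it
my $record = get($request);
# get is the function from LWP::Simple that does the work;
# it returns undef if the request fails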
Phase 4: Interpreting the results
• In order to get rid of all of the extra junk, you
need to ‘parse’ your results.
• Parsing is a fancy word for a process which
involves:
– 1. Understanding the structure of the string (where
are all of the relevant parts?)
– 2. Constructing some way to uniquely identify the
parts you want (regular expressions are good...)
– 3. Yanking out the parts you want and returning
them in some useful format.
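The three parsing steps might look like this on a small page. The HTML below is a made-up stand-in for a fetched record, and regular expressions like these are only a rough sketch; they will break on pages with a different layout.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a fetched page; a real record will differ
my $record = <<'HTML';
<html><head><title>Example cluster page</title></head>
<body><a href="/seq/1">AA001</a> <a href="/seq/2">AA002</a></body></html>
HTML

# Steps 1 and 2: the title sits between <title> tags,
# and each link's text sits between <a ...> and </a>
my ($title) = $record =~ m{<title>(.*?)</title>}is;
my @links   = $record =~ m{<a\s[^>]*>(.*?)</a>}gis;

# Step 3: return the parts in a useful format
print "Title: $title\n";        # Title: Example cluster page
print "Link:  $_\n" for @links; # Link:  AA001, then Link:  AA002
```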
Get and Post
• There are two basic methods for passing
parameters over the web.
• Get: puts the parameters into the URL, you
can see them in your browser address bar
• Post: sends the parameters in the body of the
request, so they do not appear in your address bar
• Obviously a ‘get’ request is easier for you, the
novice roboteer, to interpret and act on
Figuring out Post parameters
• Post requests are harder. Unfortunately,
there is no really easy way to figure them
out
• Look at the source for the page
• In particular, look for a section that says
something like <form action='scriptname'>
• In this section are all the parameters that
a particular script accepts, and probably
some other neat information
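Looking through the page source can itself be scripted. The sketch below pulls the script name and parameter names out of a form; the HTML and its field names are made up for illustration, and real forms may use other elements (select, textarea) that this does not cover.

```perl
use strict;
use warnings;

# Hypothetical page source; the script name and field names are made up
my $source = <<'HTML';
<form action="scriptname" method="post">
  <input type="text" name="ORG">
  <input type="text" name="CID">
  <input type="submit" value="Go">
</form>
HTML

# The form tag names the script and the method; each named input is a parameter
my ($action) = $source =~ /<form[^>]*\baction=["']([^"']+)["']/i;
my ($method) = $source =~ /<form[^>]*\bmethod=["']([^"']+)["']/i;
my @names    = $source =~ /<input[^>]*\bname=["']([^"']+)["']/gi;

print "Script: $action ($method)\n";   # Script: scriptname (post)
print "Parameters: @names\n";          # Parameters: ORG CID
```

Once the parameter names are known, the post method of LWP::UserAgent can submit them as a Post request.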