Parsing in Perl
Alberto Simões
[email protected]
YAPC::EU::2006
Parsing...
what it is...
Forget Wikipedia!
yes!
RegExps are good for tokens;
RegExps are good for regular expressions :-)
no!
most real grammars can’t be parsed with RegExps;
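To make the "good for tokens" half concrete, here is a minimal sketch of a regex-driven tokenizer. The token names (Number, Print, Var) follow the ones used later in the talk; the table layout, the `Op` class, and the `tokenize` routine are assumptions made for illustration.

```perl
use strict;
use warnings;

# Token table, tried in order: the keyword must come before the generic name.
my @TOKENS = (
    [ Number => qr/\d+/     ],
    [ Print  => qr/print\b/ ],
    [ Var    => qr/[a-z]+/  ],
    [ Op     => qr/[-+=;]/  ],
);

# Split $src into [type, text] pairs; dies on a character no rule accepts.
sub tokenize {
    my ($src) = @_;
    my @out;
    pos($src) = 0;
    while ((pos($src) // 0) < length $src) {
        next if $src =~ /\G\s+/gc;            # skip whitespace
        my $matched = 0;
        for my $t (@TOKENS) {
            my ($type, $re) = @$t;
            if ($src =~ /\G($re)/gc) {        # \G anchors at the current position
                push @out, [ $type, $1 ];
                $matched = 1;
                last;
            }
        }
        die "unexpected character at offset " . pos($src) . "\n" unless $matched;
    }
    return @out;
}

print join(" ", map { sprintf "%s(%s)", @$_ } tokenize("x = 3 + 4; print x;")), "\n";
```

Each rule is a plain regular expression; nothing here knows which token sequences are legal. That part is the grammar's job.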
Typically:
flex for lexical analysis
(re2c for thread safety and reentrancy);
bison for syntactic analysis
(lemon for thread safety and reentrancy);
but that is for C;
Perl 5 has lexical analysis (RegExps);
Perl 5 doesn’t have Grammar Support;
but we have CPAN!;
Parse::RecDescent;
Parse::Yapp;
Parse::YALALR;
Perl 6 will have Grammar Support (Hurray!)
PGE — Parrot Grammar Engine;
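Without grammar support in the language, the fallback is a hand-written top-down parser with one subroutine per rule, which is the kind of code these modules save you from writing. The sketch below is not any module's API. The grammar is an assumption reconstructed from the token names used in this talk: statements are `print expr;` or `var = expr;`, and expressions are numbers and variables joined by `+` and `-`.

```perl
use strict;
use warnings;

# Evaluate a tiny calculator program; returns the list of printed values.
sub run_program {
    my ($src) = @_;
    my (%var, @printed);

    # Lexer: one regex alternative per token class (unknown characters are skipped).
    my @tok = $src =~ /\d+|[a-z]+|[-+=;]/g;

    my $expect = sub {
        my ($want) = @_;
        my $got = shift @tok;
        die "expected '$want'\n" unless defined $got && $got eq $want;
    };

    # term: Number | Var
    my $term = sub {
        my $t = shift @tok;
        die "unexpected end of input\n" unless defined $t;
        return $t if $t =~ /^\d+$/;
        die "unexpected '$t'\n" unless $t =~ /^[a-z]+$/;
        die "undefined variable '$t'\n" unless exists $var{$t};
        return $var{$t};
    };

    # expression: term (('+' | '-') term)*   (left associative)
    my $expression = sub {
        my $value = $term->();
        while (@tok && $tok[0] =~ /^[-+]$/) {
            my $op  = shift @tok;
            my $rhs = $term->();
            $value = $op eq '+' ? $value + $rhs : $value - $rhs;
        }
        return $value;
    };

    # statement: 'print' expression ';' | Var '=' expression ';'
    while (@tok) {
        my $head = shift @tok;
        if ($head eq 'print') {
            push @printed, $expression->();
        }
        elsif ($head =~ /^[a-z]+$/) {
            $expect->('=');
            $var{$head} = $expression->();
        }
        else {
            die "unexpected '$head'\n";
        }
        $expect->(';');
    }
    return @printed;
}

print "$_\n" for run_program("a = 2 + 3; b = a - 1; print a + b;");
```

Every grammar change means editing the subroutines by hand, which is the argument for a generator.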
flex + bison;
re2c + lemon;
Parse::RecDescent;
Parse::Yapp;
flex + Parse::Yapp;
Parrot Grammar Engine
flex+bison and re2c+lemon will appear only at the end, as an
efficiency baseline.
Number: /\d+/
Var: /[a-z]+/
};
Well, I had a cheat version, but it made the test program a lot
slower than it is at the moment.
[Figure: heap usage over time. y-axis: 0M to 6M; x-axis: 0 to 240000 ms. Series: x809F49D:Perl_safesysmal, x809F54B:Perl_safesysrea, heap-admin.]
our %VAR;
my $p = Calc->new;
undef $/;                                   # slurp mode: read all of STDIN at once
my $File = <STDIN>;
$p->YYParse( yylex   => \&yylex,
             yyerror => \&yyerror );

sub yylex {
    for ($File) {
        s!^\s+!!;                           # skip leading whitespace (newlines included)
        return ("", "") if $_ eq "";        # EOF
        # Tokens
        s!^(\d+)!!    and return ("Number", $1);
        s!^print!!    and return ("Print", "print");
        s!^([a-z]+)!! and return ("Var", $1);
        # Operators: anchored, with '-' first so it is literal and not a range
        s!^([-;+=])!! and return ($1, $1);
    }
}
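One detail worth checking in any hand-written operator rule: inside a character class, a hyphen between two characters denotes a range. So `[;+-=]` accepts everything from `+` to `=` in ASCII order, digits included. A small check, with `$loose` and `$strict` as names chosen for illustration:

```perl
use strict;
use warnings;

my $loose  = qr/^[;+-=]$/;   # '+-=' is a range: '+' (0x2B) through '=' (0x3D)
my $strict = qr/^[-;+=]$/;   # leading '-' is literal: exactly - ; + =

for my $ch ('+', '-', '=', ';', '5', ',', '<') {
    printf "%s  loose=%d strict=%d\n", $ch,
        ($ch =~ $loose ? 1 : 0), ($ch =~ $strict ? 1 : 0);
}
```

The loose class matches all seven characters; the strict one matches only the four operators.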
[Figure: heap usage over time. y-axis: 0k to 1,200k; x-axis: 0 to 60000 ms. Series: x809F49D:Perl_safesysmal, x809F54B:Perl_safesysrea, heap-admin.]
%{
#include <string.h>
/* yylex returns the token name as a string (NULL at end of input) */
#define YY_DECL char* yylex(void)
char buffer[15];
%}
%%
"print"  { return strcpy(buffer, "Print"); }
[0-9]+   { return strcpy(buffer, "Number"); }
[a-z]+   { return strcpy(buffer, "Var"); }
\n       { }
" "      { }
.        { return strcpy(buffer, yytext); }
%%
NO!
you need XS glue code;
you need some Perl glue code;
you need a decent makefile;
[Figure: heap usage over time. y-axis: 0k to 600k; x-axis: 0 to 22000 ms. Series: x809F49D:Perl_safesysmal, x4032CAF:perl_yyalloc, x809F54B:Perl_safesysrea, heap-admin.]
rule statement {
    | print <expression> ; {{ $I0 = match['expression'];
                              print $I0; print "\n" }}
    | <var> = <expression> ; {{ $P0 = match['expression'];
                                $S0 = match['var']; set_global $S0, $P0 }}
}
[Figure: heap usage over time. y-axis: 0M to 8M; x-axis: 0 to 12000 ms. Series: x417A7DF:mem_sys_allocat, x417A73D:mem_sys_allocat, x417A82F:mem__internal_a, x417A880:mem__sys_reallo, heap-admin.]
[Figure: heap usage over time. y-axis: 0k to 60k; x-axis: 0 to 350 ms. Series: x80492D9:yyalloc, x40625FE:g_malloc0, x401914F:posix_memalign.]
[Figure: heap usage over time. y-axis: 0k to 6k; x-axis: 0 to 300 ms. Series: x40625FE:g_malloc0, x401914F:posix_memalign, x8048BD2:ParseAlloc, heap-admin.]
[Figure: comparison across test sizes on logarithmic axes. x-axis: Test Size (lines), 10 to 1e+07; y-axis ticks: 0.001 to 10.]
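A curve over test size can be reproduced roughly as follows. This is a sketch, not the harness used for the talk: the input generator `make_input`, the chosen sizes, and the stand-in `parse_lines` routine (which only tokenizes and counts statements) are all assumptions. Swap in the parser you actually want to measure.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Build a synthetic program of $n lines (hypothetical generator).
sub make_input {
    my ($n) = @_;
    return join '', map { "v = $_ + 1;\n" } 1 .. $n;
}

# Stand-in for the parser under test: tokenizes and counts ';' terminators.
sub parse_lines {
    my ($src) = @_;
    my $statements = 0;
    while ($src =~ /\G\s*(\d+|[a-z]+|[-+=;])/gc) {
        $statements++ if $1 eq ';';
    }
    return $statements;
}

for my $n (10, 100, 1_000, 10_000) {
    my $src     = make_input($n);
    my $start   = Time::HiRes::time();
    my $count   = parse_lines($src);
    my $elapsed = Time::HiRes::time() - $start;
    printf "%6d lines  %6d statements  %.4f s\n", $n, $count, $elapsed;
}
```

Growing the size by a factor of ten per step is what makes a log-log plot the natural way to show the results.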