Unexpected speed behaviour when benchmarking Perl regexes
Whilst discussing the relative merits of using index() in Perl to search for substrings, I decided to write a micro benchmark to prove what I had seen before: that index is faster than regular expressions when looking for a substring. Here is the benchmarking code:
    use strict;
    use warnings;
    use Benchmark qw(:all);

    # 100,000 random integers in the range 0..999
    my @random_data;
    for (1..100000) {
        push(@random_data, int(rand(1000)));
    }

    my $warn_about_counts = 0;   # set to 1 to print each variant's match count
    my $count  = 100;            # iterations per variant
    my $search = '99';

    cmpthese($count, {
        # precompiled with qr//
        'using regex' => sub {
            my $instances = 0;
            my $regex = qr/$search/;
            foreach my $i (@random_data) {
                $instances++ if $i =~ $regex;
            }
            warn $instances if $warn_about_counts;
            return;
        },
        # pattern interpolated from a scalar
        'uncompiled regex scalar' => sub {
            my $instances = 0;
            foreach my $i (@random_data) {
                $instances++ if $i =~ /$search/;
            }
            warn $instances if $warn_about_counts;
            return;
        },
        # pattern written as a literal
        'uncompiled regex literal' => sub {
            my $instances = 0;
            foreach my $i (@random_data) {
                $instances++ if $i =~ /99/;
            }
            warn $instances if $warn_about_counts;
            return;
        },
        # plain substring search
        'using index' => sub {
            my $instances = 0;
            foreach my $i (@random_data) {
                $instances++ if index($i, $search) > -1;
            }
            warn $instances if $warn_about_counts;
            return;
        },
    });
What surprised me was how these performed (using Perl 5.10.0 on a recent MacBook Pro). In descending order of speed:
- uncompiled regex literal (69.0 ops/sec)
- using index (61.0 ops/sec)
- uncompiled regex scalar (56.8 ops/sec)
- using regex (17.0 ops/sec)
Can anyone offer an explanation as to what voodoo Perl is using to get the speed of the 2 uncompiled regular expressions to perform as well as the index operation? Is it an issue in the data I've used to generate the benchmark (looking for the occurrence of 99 in 100,000 random integers), or is Perl able to do a runtime optimisation?
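Before reading too much into the timings, it may be worth confirming that the four variants count the same matches on the same data, since a variant that did less work would also look faster. The following is only a sketch of such a check: the fixed srand seed and the %count hash are my own additions for illustration and are not part of the benchmark itself.

```perl
use strict;
use warnings;

# Fixed seed so the check is repeatable (an assumption; the benchmark does not seed).
srand(42);

# Same shape of data as the benchmark: 100,000 integers in 0..999.
my @random_data = map { int(rand(1000)) } 1 .. 100_000;

my $search = '99';
my $regex  = qr/$search/;

# Count matches with each technique; grep in scalar context returns a count.
my %count = (
    'using regex'              => scalar(grep { $_ =~ $regex } @random_data),
    'uncompiled regex scalar'  => scalar(grep { /$search/ } @random_data),
    'uncompiled regex literal' => scalar(grep { /99/ } @random_data),
    'using index'              => scalar(grep { index($_, $search) > -1 } @random_data),
);

printf "%-26s %d\n", $_, $count{$_} for sort keys %count;
```

If all four lines print the same number, the variants are at least doing equivalent work, and the difference in speed has to come from how Perl executes them rather than from what they match.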
Wholesale revision
In light of @ven'tatsu's comment, I changed the benchmark a bit:
    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    use Data::Random qw( rand_words );
    use Data::Random::WordList;

    my $wl = Data::Random::WordList->new;

    # 10,000 random words, repeated 10 times; each variant gets its own copy
    my @data_1 = (rand_words( size => 10000 )) x 10;
    my @data_2 = @data_1;

    my $pat = 'a(?=b)';
    my $re  = qr/^$pat/;

    cmpthese(1, {
        # precompiled qr// object used inside a match
        'qr/$search/' => sub {
            my $instances = grep /$re/, @data_1;
            return;
        },
        # pattern string interpolated into the match
        'm/$search/' => sub {
            my $search = 'a(?=b)';
            my $instances = grep /^$search/, @data_2;
            return;
        },
    });
On Windows XP with ActiveState Perl 5.10.1:

                  Rate qr/$search/  m/$search/
    qr/$search/ 5.40/s          --        -73%
    m/$search/  20.1/s        272%          --
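For anyone unfamiliar with cmpthese output: each percentage compares the row's rate with the column's rate. As a rough sketch, the figures in the table above can be reproduced from the two rates alone. The helper name relative_percent is hypothetical, made up for this illustration, and is not part of the Benchmark module.

```perl
use strict;
use warnings;

# Rates copied from the ActiveState Perl 5.10.1 table (iterations per second).
my %rate = (
    'qr/$search/' => 5.40,
    'm/$search/'  => 20.1,
);

# Hypothetical helper: how much faster (positive) or slower (negative)
# the row entry is than the column entry, as a percentage.
sub relative_percent {
    my ($row, $col) = @_;
    return 100 * $rate{$row} / $rate{$col} - 100;
}

printf "m/\$search/ vs qr/\$search/: %.0f%%\n",
    relative_percent('m/$search/', 'qr/$search/');    # 272%
printf "qr/\$search/ vs m/\$search/: %.0f%%\n",
    relative_percent('qr/$search/', 'm/$search/');    # -73%
```

So 272% here means the interpolated match ran at roughly 3.7 times the rate of the qr// version, not 2.7 times.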
On Windows XP with Strawberry Perl 5.12.1:

                  Rate qr/$search/  m/$search/
    qr/$search/ 6.42/s          --        -66%
    m/$search/  18.6/s        190%          --
On ArchLinux with bleadperl:

                  Rate qr/$search/  m/$search/
    qr/$search/ 9.25/s          --        -38%
    m/$search/  14.8/s         60%          --