BLOG.PE-ELL.NET - Useless rambling...
:: 2009-09-04 18:48:04 ::

Speeding up Perl...

So due to the size of our environment at work and trying to keep up with our traffic (top site is <100 on Alexa and the other major sites are all <4000, one of which is ~600) we have to look at many ways to optimize code.  In some cases you have done a lot to make your code more efficient but you're still suffering because of the language engine.  Since we're currently using Perl 5.8.8 on a majority of our production servers it made sense to look further into how Perl was operating.  Now this patch won't be overly useful for a lot of people, since their sites don't get enough traffic for it to matter.

So the basic issue is file system stats.  It seems that Perl really likes to stat things when trying to load a file.  First of all your default @INC has 5 sub versions of Perl in it (in this case you would get 5.8.8, 5.8.7, 5.8.6, 5.8.5, etc) for all of your paths.  So leaving our additional lib paths out of the equation, your base install will have ~32 include paths that it must stat through until it finds the library that you're trying to load.

Now on top of that you also have this thing called a "pmc" file.  If you ever strace Perl doing things you'll see those going by too.  Those are an old feature for pre-compiled libraries: Perl looks for a .pmc file before the .pm file and can load it instead.  So for the examples, here's basically what my initial @INC looked like:

    @INC:
    /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.8
    /usr/lib/perl5/site_perl/5.8.7/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.7
    /usr/lib/perl5/site_perl/5.8.6/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.6
    /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.4
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.8
    /usr/lib/perl5/vendor_perl/5.8.7/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.7
    /usr/lib/perl5/vendor_perl/5.8.6/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.6
    /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.5
    /usr/lib/perl5/vendor_perl/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.4
    /usr/lib/perl5/vendor_perl
    /usr/lib/perl5/5.8.8/i386-linux-thread-multi
    /usr/lib/perl5/5.8.8
    /usr/lib/perl5/5.8.7/i386-linux-thread-multi
    /usr/lib/perl5/5.8.7
    /usr/lib/perl5/5.8.6/i386-linux-thread-multi
    /usr/lib/perl5/5.8.6
    /usr/lib/perl5/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/5.8.5
    /usr/lib/perl5/5.8.4/i386-linux-thread-multi
    /usr/lib/perl5/5.8.4
    .

So say I wanted to use a library that was in /usr/lib/perl5/5.8.8/ like File::Path.  Not only would it have to stat its way down most of the tree, it would also have to stat for a "pmc" file first in each directory.  While this seems potentially trivial because the Linux file system is pretty good at caching data, if you have a lot of libraries to load it can add a decent amount of overhead (uncached means an IO hit).
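To get a feel for how long that walk is on your own box, here's a rough sketch that lists the candidate paths for a module in @INC order, with the .pmc candidate ahead of the .pm in each directory.  It's only a model of the lookup described above, not what the interpreter does internally, and the helper names (module_to_relpath, probe_candidates) are made up for this example.

```perl
use strict;
use warnings;

# Turn a module name such as File::Path into the relative path
# that gets looked up under each include directory (File/Path.pm).
sub module_to_relpath {
    my ($module) = @_;
    (my $rel = $module) =~ s{::}{/}g;
    return "$rel.pm";
}

# Walk the include paths in order and collect every candidate path
# that gets probed until the module is found.
sub probe_candidates {
    my ($module, @inc) = @_;
    my $rel = module_to_relpath($module);
    my @probed;
    for my $dir (@inc) {
        next if ref $dir;              # skip @INC hooks (code refs, objects)
        my $pm = "$dir/$rel";
        push @probed, "${pm}c";        # the .pmc candidate comes first
        push @probed, $pm;             # then the .pm itself
        last if -e $pm or -e "${pm}c"; # found it, stop walking
    }
    return @probed;
}

my @probed = probe_candidates('File::Path', @INC);
printf "%d candidate paths probed for File::Path\n", scalar @probed;
print "  $_\n" for @probed;
```

Run it with the same perl binary your app uses.  The number it prints won't match an strace count exactly, but it shows how far down @INC a given module lives.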

So when I first was looking into this I discovered that one of our common CPAN modules was requiring 47 stats in order to load the library.  Now it takes 9.  In our environment we don't upgrade Perl, so we don't need the previous sub versions in @INC.  Now there's a compile flag for removing the previous versions or setting the list to something arbitrary, but everything I tried still gave me the old versions in @INC (and sometimes more).  So I gave up on that approach and just created a patch file for our source RPM.

If you apply this patch file it will remove all previous versions from @INC and remove the "pmc" file check when looking for libraries.  My @INC now looks like this:

    @INC:
    /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.8
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.8
    /usr/lib/perl5/vendor_perl
    /usr/lib/perl5/5.8.8/i386-linux-thread-multi
    /usr/lib/perl5/5.8.8
    .
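If you can't rebuild Perl, you can get part of the way there at runtime by filtering @INC yourself.  To be clear, this is not what the patch does: the patch changes the compiled-in default and removes the "pmc" check, and a runtime filter can't touch the "pmc" check at all.  The function name trim_old_versions is made up for this sketch, and it assumes old-version directories show up as a three-part version number in the path, like in the listing above.

```perl
use strict;
use warnings;

# Return only the include paths that do not belong to another Perl version.
# $current is the version to keep, e.g. "5.8.8".
sub trim_old_versions {
    my ($current, @inc) = @_;
    return grep {
        ref($_)                               # leave @INC hooks alone
          || !m{/(\d+\.\d+\.\d+)(?:/|$)}      # no version component: keep
          || $1 eq $current                   # current version: keep
    } @inc;
}

# Version of the running interpreter as a dotted string.
my $current = sprintf '%vd', $^V;
my @trimmed = trim_old_versions($current, @INC);
print "$_\n" for @trimmed;
```

To actually use it you'd assign the result back to @INC in a BEGIN block (or a mod_perl startup file) before anything else gets loaded.  The same NOTE below applies: if you do upgrade Perl in place, libraries living in the old version directories will stop being found.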

NOTE: If you do upgrade Perl on your boxes you won't want to use the full patch, since your older libraries will be in the previous version directories.  But you could trim out everything except the removal of the "pmc" file check.

perl-5.8.8-ffi.patch

And yes, we use mod_perl to help keep our library reload rates as low as possible, but reloads will still happen, so make them as painless as possible.