Gearman::Driver

| | Comments (0) | TrackBacks (0)
Gearman was initially developed by Danga. It's a very nice framework to distribute (sometimes long running) tasks across multiple servers. For a more detailed description see gearman.org where you will also find a rewrite of it in C. First let's take a look at it without Gearman::Driver, using a plain Gearman::XS setup:
#!/usr/bin/env perl
use strict;
use warnings;
use Gearman::XS::Worker;
use Gearman::XS qw(:constants);
use Imager;

my $worker = Gearman::XS::Worker->new;
$worker->add_server( 'localhost', 4730 );

$worker->add_function( "convert_to_jpeg", 0, \&convert_to_jpeg, {} );
$worker->add_function( "convert_to_gif",  0, \&convert_to_gif,  {} );

while (1) {
    my $ret = $worker->work();
    if ( $ret != GEARMAN_SUCCESS ) {
        printf( STDERR "%s\n", $worker->error() );
    }
}

sub convert_to_jpeg {
    my ($job) = @_;
    return _convert( $job->workload, 'jpeg' );
}

sub convert_to_gif {
    my ($job) = @_;
    return _convert( $job->workload, 'gif' );
}

sub _convert {
    my ( $in_data, $format ) = @_;
    my $img = Imager->new();
    my $out_data;
    $img->read( data => $in_data ) or die;
    $img->write( data => \$out_data, type => $format ) or die;
    return $out_data;
}
The problem in this design is to have two functions in the same script. If you depend on running convert_to_jpeg and convert_to_gif at the same time you could split up the script:
#!/usr/bin/env perl
use strict;
use warnings;
use Gearman::XS::Worker;
use Gearman::XS qw(:constants);
use Imager;

my $worker = Gearman::XS::Worker->new;
$worker->add_server( 'localhost', 4730 );

$worker->add_function( "convert_to_jpeg", 0, \&convert_to_jpeg, {} );

while (1) {
    my $ret = $worker->work();
    if ( $ret != GEARMAN_SUCCESS ) {
        printf( STDERR "%s\n", $worker->error() );
    }
}

sub convert_to_jpeg {
    my ($job) = @_;
    return _convert( $job->workload, 'jpeg' );
}

sub _convert {
    my ( $in_data, $format ) = @_;
    my $img = Imager->new();
    my $out_data;
    $img->read( data => $in_data ) or die;
    $img->write( data => \$out_data, type => $format ) or die;
    return $out_data;
}
#!/usr/bin/env perl
use strict;
use warnings;
use Gearman::XS::Worker;
use Gearman::XS qw(:constants);
use Imager;

my $worker = Gearman::XS::Worker->new;
$worker->add_server( 'localhost', 4730 );

$worker->add_function( "convert_to_gif",  0, \&convert_to_gif,  {} );

while (1) {
    my $ret = $worker->work();
    if ( $ret != GEARMAN_SUCCESS ) {
        printf( STDERR "%s\n", $worker->error() );
    }
}

sub convert_to_gif {
    my ($job) = @_;
    return _convert( $job->workload, 'gif' );
}

sub _convert {
    my ( $in_data, $format ) = @_;
    my $img = Imager->new();
    my $out_data;
    $img->read( data => $in_data ) or die;
    $img->write( data => \$out_data, type => $format ) or die;
    return $out_data;
}
The next problem is about memory: Having many scripts running in separate processes using the same (sometimes heavyweight) library is not very smart. And especially in this example where images are processed those separate processes may consume a lot of ram after a long runtime. So it would be cool to share the memory consumed by libraries as well as freeing up memory from time to time. This is where Gearman::Driver comes in handy. It's not only sharing memory by using the advantages of copy-on-write but it's also having other cool features like restarting worker processes after some idle time to free up memory. Besides that it got some telnet interface to change amount of processes on runtime as well as getting some statistics et cetera. So let's re-implement the example above:
package GDExamples::Convert;

use base qw(Gearman::Driver::Worker);
use Moose;
use Imager;

sub process_name {
    my ( $self, $orig, $job_name ) = @_;
    return "$orig ($job_name)";
}

sub convert_to_jpeg : Job : MinProcesses(0) : MaxProcesses(5) {
    my ( $self, $job, $workload ) = @_;
    return _convert( $workload, 'jpeg' );
}

sub convert_to_gif : Job : MinProcesses(0) : MaxProcesses(5) {
    my ( $self, $job, $workload ) = @_;
    return _convert( $workload, 'gif' );
}

sub _convert {
    my ( $in_data, $format ) = @_;
    my $img = Imager->new();
    my $out_data;
    $img->read( data => $in_data ) or die;
    $img->write( data => \$out_data, type => $format ) or die;
    return $out_data;
}
To run that worker you can use the script called gearman_driver.pl which is part of the Gearman::Driver distribution.
usage: gearman_driver.pl [long options...]
        --loglevel           Log level (default: INFO)
        --lib                Example: --lib ./lib --lib /custom/lib
        --max_idle_time      How many seconds a worker may be idle before its killed
        --server             Gearman host[:port][,host[:port]]
        --logfile            Path to logfile (default: gearman_driver.log)
        --console_port       Port of management console (default: 47300)
        --interval           Interval in seconds (see Gearman::Driver::Observer)
        --loglayout          Log message layout (default: [%d] %p %m%n)
        --namespaces         Example: --namespaces My::Workers --namespaces My::OtherWorkers

Run the example: gearman_driver.pl --namespaces GDExamples &

You can easily test the workers using the gearman client:

  • gearman -f GDExamples::Convert::convert_to_jpeg < cpan.png > cpan.jpg
  • gearman -f GDExamples::Convert::convert_to_gif < cpan.png > cpan.gif

Now let's see what we did. First of all you may have noticed the attributes on both methods:

  • sub convert_to_jpeg : Job : MinProcesses(0) : MaxProcesses(5) {}
  • sub convert_to_jpeg : Job : MinProcesses(0) : MaxProcesses(5) {}
  • sub convert_to_jpeg : Job : MinProcesses(0) : MaxProcesses(5) {}

Basically you only need to add the attribute Job to the methods which should be registered with gearmand. All other attributes are optional. Your method will be registered with the full package name:

  • GDExamples::Convert::convert_to_jpeg
  • GDExamples::Convert::convert_to_gif
If you don't like that behaviour you can override prefix. By default it's implemented this way:
sub prefix {
    return ref(shift) . '::';
}

There are other predefined methods which can be overridden, like begin or end. Both are called whenever a job function is called, begin before the job method is called and end afterwards (even if the job method died).

The other two attributes tell Gearman::Driver how many processes should be forked and work on that function/job. If you set MinProcesses to 0 no process is preforked at all. It's being forked on demand: Gearman::Driver monitors the telnet interface of gearmand for queued jobs to see if it's necessary to fork a new process. If there are many jobs in the queue it will fork even more processes until MaxProcesses is reached. After all jobs are done it will kill all processes again. The default behaviour is to do that immediately, but can be changed by setting max_idle_time to 300 (seconds) for example.

Gearman::Driver itself got a telnet interface as well. There's also a client coming with the distribution having readline support. Another feature is to connect to multiple servers at once to send the same commands to all servers. Really handy in a big environment.

gearman_driver_console.pl --server localhost:47300
console> status
localhost:47300> GDExamples::Convert::convert_to_gif   0  5  0  2010-01-30T19:45:43  1970-01-01T00:00:00   
localhost:47300> GDExamples::Convert::convert_to_jpeg  0  5  0  2010-01-30T19:45:53  1970-01-01T00:00:00   
localhost:47300> .
console> set_processes GDExamples::Convert::convert_to_gif 2 10
localhost:47300> OK
localhost:47300> .
console> status
localhost:47300> GDExamples::Convert::convert_to_gif   2  10  2  2010-01-30T19:45:43  1970-01-01T00:00:00   
localhost:47300> GDExamples::Convert::convert_to_jpeg  0   5  0  2010-01-30T19:45:53  1970-01-01T00:00:00   
localhost:47300> .

I've just shipped v0.01017 to the CPAN containing the examples in this blog post. If you can't wait for it, you can fetch it from GitHub. Any comments/suggestions/rants will be highly appreciated. Thanks.

Textmate + Perl + New Package

| | Comments (0) | TrackBacks (0)
Everytime you create a new file in your project you need to type the package name:
lib/My/Project/DB/Result/Person.pm => My::Project::DB::Result::Person
That's really annoying, boring and prone to error. For Textmate users there's a simple solution (notice: the script expects your modules in directory "lib"):
#!/usr/bin/env perl
use strict;
use warnings;

my $package = get_package( path => $ENV{TM_FILEPATH}, walk_up_to => 'lib' );
my $content = get_content();

if ($content) {
    $content =~ s/^package (.*?);/package $package;/;
    print $content;
}
else {
    print "package $package;\n\nuse strict;\nuse warnings;\n\n1;";
}

sub get_content {
    local $/ = undef;
    return <>;
}

sub get_package {
    my (%args) = @_;
    my @chunks = split qr~/~, $args{path};
    while ( @chunks > 0 ) {
        last if $chunks[0] eq $args{walk_up_to};
        shift @chunks;
    }
    shift @chunks;
    my $pkg = join '::', @chunks;
    $pkg =~ s/\.pm//g;
    return $pkg;
}

Save that script and setup a new command in Textmates bundle editor as shown in the screenshot:
bundle editor
Next time you create a new perl module just hit Cmd + Ctrl + p and select "Filename to Package":
context menu
If the file is empty the script will insert following code:
package My::Project::DB::Result::Person;

use strict;
use warnings;

1;
If the file contained code already the script will just update the "package ...;" line with the new value. This is very handy if you move a file around and just want to update the package name.

Catalyst + DBIC + DAO

| | Comments (0) | TrackBacks (0)
Recently there was a question on the Catalyst mailing list which i've been asked by some teammate too: "How do i create DataAccessObjects with methods i can use in Catalyst as well as in CronJobs or other scripts?"

You're too lazy to read the full entry, you just want to see the code? It's available on GitHub: http://github.com/plu/dao-example/tree/master
It's quite easy, all you need is custom DBIx::Class ResultSets. I guess you've already setup a model in your Catalyst app:

package DAO::Example::Model::DB;

use strict;
use warnings;
use base 'Catalyst::Model::DBIC::Schema';

1;

...and a config file that might look like:
# rename this file to DAO::Example.yml and put a : in front of "name" if
# you want to use yaml like in old versions of Catalyst
name DAO::Example


    schema_class    DAO::Example::DB
    connect_info    dbi:SQLite:dao_example.db
    connect_info    username
    connect_info    password

To get custom resultsets for all of your result classes setup the schema using load_namespaces:
package DAO::Example::DB;

use strict;
use warnings;
use base qw/DBIx::Class::Schema/;

__PACKAGE__->load_namespaces( default_resultset_class => '+DAO::Example::DB::Base::ResultSet' );

1;

Going that way your result classes will be expected in DAO::Example::DB::Result:: namespace, resultsets in DAO::Example::DB::ResultSet::. If you create a class DAO::Example::DB::Result::Person DBIC will look if DAO::Example::DB::ResultSet::Person exists and inherit all person resultsets from that class. If there's no such class the resultset will be inherited from DAO::Example::DB::Base::ResultSet.

Let's add some methods to the person resultset:
package DAO::Example::DB::ResultSet::Person;

use strict;
use warnings;
use base qw/DAO::Example::DB::Base::ResultSet/;

sub by_username {
    my ( $rs, $username ) = @_;
    return $rs->search( { 'me.username' => $username }, { key => 'unique_username' } );
}

sub prefetch_all {
    my ($rs) = @_;
    return $rs->search( {}, { prefetch => [ { personroles => [qw/role/] } ] } );
}

1;

And to the default resultset:
package DAO::Example::DB::Base::ResultSet;

use strict;
use warnings;
use base qw/DBIx::Class::ResultSet::HashRef DAO::Example::DB::Base::Any/;

sub active {
    my ($rs) = @_;
    return $rs->search( { 'me.active' => 1 } );
}

sub inactive {
    my ($rs) = @_;
    return $rs->search( { 'me.active' => { '!=' => 1 } } );
}

1;

In catalyst you would call these methods that way:
package DAO::Example::Controller::Root;

use strict;
use warnings;
use parent 'Catalyst::Controller';
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;

__PACKAGE__->config->{namespace} = '';

sub index : Path : Args(0) {
    my ( $self, $c ) = @_;

    unless ( -e $c->path_to('dao_example.db') ) {
        $c->model('DB')->schema->deploy;
        $c->model('DB')->schema->init;
    }

    $c->res->print('
');
    $c->res->print( Dumper $c->config );
    $c->res->print( Dumper $c->model('DB')->resultset('Person')->active->hashref_array );
    $c->res->print( Dumper $c->model('DB')->resultset('Person')->inactive->hashref_array );
    $c->res->print( Dumper $c->model('DB')->resultset('Person')->by_username('plu')->prefetch_all->hashref_array );
    $c->res->print('
'); }

The method hashref_array comes from DBIx::Class::ResultSet::HashRef. How would you call that from a CronJob / without Catalyst?
#!/usr/bin/env perl
use strict;
use warnings;
use FindBin;
use lib "$FindBin::Bin/../lib";
use DAO::Example::Utils qw/schema config/;
use Data::Dumper;
$Data::Dumper::Sortkeys=1;

unless (-e "$FindBin::Bin/../dao_example.db") {
    schema->deploy;
    schema->init;
}

print Dumper config;

print Dumper schema->resultset('Person')->active->hashref_array;

print Dumper schema->resultset('Person')->inactive->hashref_array;

print Dumper schema->resultset('Person')->by_username('plu')->prefetch_all->hashref_array;

The "magic" thing is to instantiate a DBIx::Class::Schema object using your Catalyst config file. I tend to write a small utility class to achieve that:
package DAO::Example::Utils;

use strict;
use warnings;
use base 'Exporter';
use Config::JFDI;
use DAO::Example::DB;

use vars qw/@EXPORT_OK $schema $config/;

@EXPORT_OK = qw/
  schema
  config
  /;

sub config {
    return $config if defined $config;
    $config = Config::JFDI->new( name => "DAO::Example" )->get;
    return $config;
}

sub schema {
    return $schema if defined $schema;
    $schema = DAO::Example::DB->connect( @{ config->{'Model::DB'}{connect_info} || [] } );
    return $schema;
}

1;

irssi: Twitter

| | Comments (0) | TrackBacks (0)
  • install all dependencies (POE, POE::Loop::Glib, POE::Session::Irssi, Net::Twitter, HTML::Entities, HTTP::Date)
  • change credentials
  • /script load /path/to/twitter.pl
  • /window new hidden
  • change to the new window and type /window name twitter
use strict;
use warnings;
use Irssi;
use Glib;
use POE qw(Loop::Glib Session::Irssi);
use Net::Twitter;
use HTML::Entities qw( decode_entities );
use HTTP::Date qw( time2str );

my $VERSION = '0.1';
my %IRSSI   = (
    authors => 'Johannes Plunien',
    contact => 'http://www.pqpq.de/contact/',
    name    => 'Twitter',
    license => 'Perl',
);  

my %twitter_config = (
    username        => 'plutooth',
    password        => 'secret',
    update_interval => 90,              # seconds
);

POE::Session::Irssi->create(
    irssi_commands => {
        tmsg => sub {
            $_[HEAP]->{twitter}->update(join " ", (@{ $_[ARG1] })[0]);
            $_[KERNEL]->delay( update => 5 );
        },   
    },
    inline_states => {
        _start => sub {
            my $heap = $_[HEAP];
            $heap->{window} = Irssi::window_find_name('twitter');
            Irssi::print("Create a window named 'twitter'") if !$heap->{window};
            my ( $kernel, $heap ) = @_[ KERNEL, HEAP ];
            $heap->{interval} = delete $twitter_config{update_interval};
            $heap->{twitter}  = Net::Twitter->new(%twitter_config)
              or Irssi::print("Twitter ERROR: $!");
            $kernel->delay( update => 2 );
        },   
        update => sub {
            my ( $kernel, $heap ) = @_[ KERNEL, HEAP ];
            my $params = {};
            $params->{since} = $heap->{last_update} if defined $heap->{last_update};
            my $timeline    = $heap->{twitter}->friends_timeline($params);
            my $http_status = $heap->{twitter}->http_code;
            $heap->{window} = Irssi::window_find_name('twitter');
            if ( $http_status == 200 && $heap->{window} ) {
                $heap->{last_update} = time2str(time);
                $kernel->yield( display => $timeline ) if scalar @$timeline > 0;
            }
            $kernel->delay( update => $heap->{interval} );
        },   
        display => sub {
            my ( $kernel, $heap, $timeline ) = @_[ KERNEL, HEAP, ARG0 ];
            foreach my $row ( reverse @$timeline ) {
                my $screen_name = $row->{user}{screen_name};
                my $text = $row->{text};
                $text =~ s/\n/\n   /;
                my $cleantext = decode_entities($text);
                my $message = "<$screen_name> $cleantext";
                $heap->{window}->print($message, MSGLEVEL_PUBLIC);
            }
        },
    },
);

irssi: Close all queries

| | Comments (0) | TrackBacks (0)
This irssi script creates a new command to close all open queries at once. I don't know what else to explain here, i think the script explains itself.
use strict;
use warnings;
use Irssi;
use Glib;
use POE qw(Loop::Glib Session::Irssi);

my $VERSION = '0.1';
my %IRSSI = (
    authors     => 'Johannes Plunien',
    contact     => 'http://www.pqpq.de/contact/',
    license     => 'Perl',
);

POE::Session::Irssi->create(
    irssi_commands => {
        qc => sub {
            foreach my $w (Irssi::windows()) {
                my $act = $w->{active};
                next unless defined $act->{type};
                $w->command("window close") if $act->{type} eq 'QUERY';
            }
        },
    },
);

Firebug debugging fallback

| | Comments (0) | TrackBacks (0)
Firebug is the best tool for debugging heavyweight javascript web applications. You can put console.log(arguments) here and console.debug(foo) there, but ... YOU SHOULD NEVER FORGET TO REMOVE IT! If you forget it, your application stops working in any browser that doesn't know those methods. To avoid this, just include this snippet:
if(!window.console || !window.console.firebug) {
    window.console = {
        debug: function(){},
        log: function(){}
    };
}
/Edit: I just found http://getfirebug.com/firebug/firebugx.js which is more complete:
if (!window.console || !console.firebug)
{
    var names = ["log", "debug", "info", "warn", "error",
                 "assert", "dir", "dirxml", "group",
                 "groupEnd", "time", "timeEnd", "count",
                 "trace", "profile", "profileEnd"];

    window.console = {};
    for (var i = 0; i < names.length; ++i)
        window.console[names[i]] = function() {}
}

Remote form validation

| | Comments (0) | TrackBacks (0)
Catalyst and ExtJS are my favourite frameworks for developing web applications. ExtJS provides a lot features to validate form data before the form is being submitted to the server. Everybody knows that you never should trust those values even if it was validated in the browser by ExtJS. Here is a comfortable way to validate form values remotely using Catalyst::Plugin::FormValidator::Simple.

The full example can be found on github:
git clone git://github.com/plu/rfv-example.git


  • Create an ExtJS FormPanel:
        var win = new Ext.Window({
            .....
            ,items:[
                new Ext.FormPanel({
                    ,url:'[% Catalyst.uri_for('/save') %]'
                    ,id: 'fp'
                    .....
                    ,items: [{
                            fieldLabel: 'First name',
                            name: 'first',
                            anchor:'100%'
                        },{
                            fieldLabel: 'Last name',
                            name: 'last',
                            anchor:'100%'
                        },{
                            fieldLabel: 'Company',
                            name: 'company',
                            anchor:'100%'
                        },{
                            fieldLabel: 'eMail',
                            name: 'email',
                            anchor:'100%'
                        }
                    ]
    
                    ,buttons: [{
                        text: 'Save',
                        handler: function() {
                            win.findById('fp').getForm().submit();
                        }
                    }]
                })
            ]
        });
    
  • Read validation results and modify form fields:
        var fp = win.findById('fp');
        fp.on('actioncomplete', function(a, fp) {
            var r = fp.result;
            for (var field in r.error) {
                fp.form.findField(field).markInvalid(
                    r.error[field].join("<br/>")
                );
            }
        });
    
  • Validate form values:
    sub save : Global {
        my ( $self, $c ) = @_;
    
        $c->form(
            first   => [ qw/NOT_BLANK ASCII/ => [qw/LENGTH 2 20/] ],
            last    => [ qw/NOT_BLANK ASCII/ => [qw/LENGTH 2 20/] ],
            company => [ qw/NOT_BLANK ASCII/ => [qw/LENGTH 2 20/] ],
            email   => [qw/NOT_BLANK EMAIL_LOOSE/],
        );
    
        my $json : Stashed = {
            error   => $c->form->field_messages('save'),
        };
    }
    

MojoMojo - make it private

| | Comments (0) | TrackBacks (0)
MojoMojo is a wiki written in Catalyst and DBIx::Class. It was the best choice to replace my private MediaWiki. The installation is straight forward:
svn co http://code2.0beta.co.uk/mojomojo/svn/trunk/ MojoMojo
cd MojoMojo
perl Makefile.PL
make
The standard way to install MojoMojo is through CPAN. You can find the distribution for it here, or you can install it from the command line like this:
sudo cpan MojoMojo
After successful installation MojoMojo needs to be configured using its mojomojo.conf. Generally you can just edit this file directly, but if you're using the svn checkout and probably want to contribute some changes later you better go for a new mojomojo_local.conf. So you can have your database credentials in that file and you don't have to worry about committing it by accident.
<Model::DBIC>
    connect_info   dbi:mysql:mojomojo
    connect_info   mojomojo
    connect_info   sTrOnGpAsSwOrD
</Model::DBIC>
Create database tables:
./script/mojomojo_spawn_db.pl
Done. Now you can deploy MojoMojo using mod_perl, FastCGI or just run it using Catalysts development webserver:
./script/mojomojo_server -p 3000
My previous MediaWiki installation was only accessible by username and password. So i needed to get my head around the authorization implementation in MojoMojo. First of all you need to add some lines to the config:
<permissions>
    check_permission_on_view    1
    cache_permission_data       1
    create_allowed              0
    delete_allowed              0
    edit_allowed                0
    view_allowed                0
    attachment_allowed          0
</permissions>
Additionally there are a few database changes necessary:
-- add a new role
insert into role values (null, 'user', 1);

-- add this role to the standard admin user
insert into role_member values (1, 2, 1);

-- allow everything to that new role
insert into path_permissions values
    ('/', 1, 'no', 'yes', 'yes', 'yes', 'yes', 'yes');
insert into path_permissions values
    ('/', 1, 'yes', 'yes', 'yes', 'yes', 'yes', 'yes');
How does it look like now?
mysql> select * from role;
+----+------+--------+
| id | name | active |
+----+------+--------+
|  1 | user |      1 | 
+----+------+--------+

mysql> select * from role_member;
+------+--------+-------+
| role | person | admin |
+------+--------+-------+
|    1 |      2 |     1 | 
+------+--------+-------+

mysql> select * from path_permissions\G
*************************** 1. row ***************************
              path: /
              role: 1
 apply_to_subpages: no
    create_allowed: yes
    delete_allowed: yes
      edit_allowed: yes
      view_allowed: yes
attachment_allowed: yes
*************************** 2. row ***************************
              path: /
              role: 1
 apply_to_subpages: yes
    create_allowed: yes
    delete_allowed: yes
      edit_allowed: yes
      view_allowed: yes
attachment_allowed: yes
To get this working please make sure to use at least version 0.999018 or revision 927 of MojoMojo. In next release of MojoMojo there will be also a new flag to enforce login:
<permissions>
    enforce_login 1
</permissions>