DBIx::Class join pruning

DBIx::Class has a join optimiser to prune out any unnecessary joins from the queries it generates.  Occasionally this can cause code that looks good to fail unexpectedly.  The most obvious time this normally happens when using literal SQL.

The most common example of this is when doing an aggregate operation with a group by.  Here I’m trying to sum and count some values on a joined table.  This looks like pretty standard DBIC code, but it fails.  I don’t use the linked table in the group by, just in the columns where it’s my own raw SQL.

sub total
    my $self = shift;
    my @columns = $self->result_source->columns;
    my $me = $self->current_source_alias;
    return $self->search(undef, {
        join => ['items'],
        '+columns' => [
            { total => \'sum(amount) as total' },
            { lines => \'count(amount) as lines' },
        group_by => [map {"$me.$_"} @columns],

        total => { '>' => 10 }
        '+columns' => ['total', 'lines']


DBI Exception: DBD::Pg::st execute failed: ERROR:  column "amount" does not exist
LINE 1: ... (SELECT "me"."id", "me"."name", "me"."type", sum(amount) as...
                                                             ^ [for Statement "SELECT "me"."id", "me"."name", "me"."type", "me"."total", "me"."lines" FROM (SELECT "me"."id", "me"."name", "me"."type", sum(amount) as total, count(amount) as lines FROM "orders" "me" GROUP BY "me"."id", "me"."name", "me"."type") "me" WHERE ( "total" > ? )" with ParamValues: 1='10']

Looking closely at the SQL generated the items alias isn’t joined in.  The optimiser has pruned that join because it couldn’t see any reference to the column.  It turns out that DBIC does really try to do the right thing, it’s just we didn’t give it enough info to give it a hint.

Adding the ‘items.’ qualifier to the column names in the SQL allows DBIC to realise that that join is necessary.

   return $self->search(undef, {
        join => ['items'],
        '+columns' => [
            # NOTE: fully qualified name
            # even though they are not ambiguous.
            { total => \'sum(items.amount) as total' },
            { lines => \'count(items.amount) as lines' },
        group_by => [map {"$me.$_"} @columns],

With that table alias, not strictly necessary for the SQL since those column names are unambiguous, DBIC realises that it is needs the join.

SELECT "me"."id", "me"."name", "me"."type", "me"."total", "me"."lines"
  FROM (
    SELECT "me"."id", "me"."name", "me"."type", sum( items.amount ) AS total, count( items.amount ) AS lines
      FROM "orders" "me"
      LEFT JOIN "order_items" "items"
        ON "items"."order_id" = "me"."id"
    GROUP BY "me"."id", "me"."name", "me"."type"
   ) "me"
WHERE "total" > '10'

DBIx::Class and Postgres tweaks at startup

After connection you can do simple commands thanks to the on_connect_do connection setting. One thing I sometimes do is turn down the whining. Postgres can be quite noisy when you’re creating tables,

NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "blah_pkey" for table "blah"
NOTICE:  CREATE TABLE will create implicit sequence "blah_id_seq" for serial column "blah.id"

So I sometimes SET client_min_messages=WARNING at connection. That makes deploying a schema a lot less verbal.

If you want to deploy to a different schema within a database you can also do a ‘SET search_path TO’ statement at that point too. That can be quite handy if you want to deploy the same tables again to an alternative schema within a database.

perl -I lib -MMyModule::Schema::DB -e "MyModule::Schema::DB->connect('dbi:Pg:dbname=database;host=postgres', 'username', 'password', { on_connect_do => 'SET search_path TO temp' })->deploy"

If you’re setting up a Catalyst config for DBIC you can set the connection options like this,

    connect_info dbi:Pg:dbname=database
    connect_info username
    connect_info password
        on_connect_do  [ SET client_min_messages=WARNING ]
        quote_names 1