DBIx::Class join pruning

DBIx::Class has a join optimiser to prune out any unnecessary joins from the queries it generates.  Occasionally this can cause code that looks good to fail unexpectedly.  The most obvious time this normally happens when using literal SQL.

The most common example of this is when doing an aggregate operation with a group by.  Here I’m trying to sum and count some values on a joined table.  This looks like pretty standard DBIC code, but it fails.  I don’t use the linked table in the group by, just in the columns where it’s my own raw SQL.

sub total
    my $self = shift;
    my @columns = $self->result_source->columns;
    my $me = $self->current_source_alias;
    return $self->search(undef, {
        join => ['items'],
        '+columns' => [
            { total => \'sum(amount) as total' },
            { lines => \'count(amount) as lines' },
        group_by => [map {"$me.$_"} @columns],

        total => { '>' => 10 }
        '+columns' => ['total', 'lines']


DBI Exception: DBD::Pg::st execute failed: ERROR:  column "amount" does not exist
LINE 1: ... (SELECT "me"."id", "me"."name", "me"."type", sum(amount) as...
                                                             ^ [for Statement "SELECT "me"."id", "me"."name", "me"."type", "me"."total", "me"."lines" FROM (SELECT "me"."id", "me"."name", "me"."type", sum(amount) as total, count(amount) as lines FROM "orders" "me" GROUP BY "me"."id", "me"."name", "me"."type") "me" WHERE ( "total" > ? )" with ParamValues: 1='10']

Looking closely at the SQL generated the items alias isn’t joined in.  The optimiser has pruned that join because it couldn’t see any reference to the column.  It turns out that DBIC does really try to do the right thing, it’s just we didn’t give it enough info to give it a hint.

Adding the ‘items.’ qualifier to the column names in the SQL allows DBIC to realise that that join is necessary.

   return $self->search(undef, {
        join => ['items'],
        '+columns' => [
            # NOTE: fully qualified name
            # even though they are not ambiguous.
            { total => \'sum(items.amount) as total' },
            { lines => \'count(items.amount) as lines' },
        group_by => [map {"$me.$_"} @columns],

With that table alias, not strictly necessary for the SQL since those column names are unambiguous, DBIC realises that it is needs the join.

SELECT "me"."id", "me"."name", "me"."type", "me"."total", "me"."lines"
  FROM (
    SELECT "me"."id", "me"."name", "me"."type", sum( items.amount ) AS total, count( items.amount ) AS lines
      FROM "orders" "me"
      LEFT JOIN "order_items" "items"
        ON "items"."order_id" = "me"."id"
    GROUP BY "me"."id", "me"."name", "me"."type"
   ) "me"
WHERE "total" > '10'

Querying debian/ubuntu packages

The debian package management system is something I’ve been using for years, and it’s served me well for all those years. For the first decade I rarely needed to use more than the common apt-get install or dpkg -i package.deb. Listing the installed packages with dpkg -l has been useful too, but apart from apt-cache search those were all the commands I needed most of the time.

Over the past 5 years I’ve been using slightly more complex queries to figure out information about packages I’ve installed. Working at a company where we are deploying and supporting debian servers on a regular basis has meant it’s been useful to poke about the .deb infrastructure a litle more. In truth most of the time I just use 2 commands, but they seem to get me through most of what I want to do.

If I don’t know which package a program comes from I can use dpkg-query with the -S flag.

$ dpkg-query -S /sbin/ss
iproute2: /bin/ss

To figure out what is provided by a package I can use the -L flag.

$ dpkg-query -L iproute2 | grep etc

That’s also useful for checking a package is still fully installed,

for f in `dpkg-query -L python-lxml`; do if [ ! -e $f ]; then echo Missing $f; fi; done

Of course there are other commands that I use periodically, but these are the ones I’ve been using frequently enough to start remembering by heart.

Vim word breaks and perl

I use vim for coding my Perl and in the most recent versions of Ubuntu I’ve not been keen on one of the changes to the syntax defaults.  The word definition has start to include : which means that class names like Blah::Thing count as one word which isn’t how I’m used to working.  Luckily vim is very configurable and the vim irc channel is also really helpful so I was able to find where the configuration was and how to override it quickly.

Finding out that it’s related to the iskeyword configuration was pretty simple, but changing that didn’t have any effect.  It turns out it is set in a Perl syntax specific configuration so overriding it is a little more tricky.

Figuring out where a configuration setting was set is done using :verb set ?

:verb set isk?
Last set from /usr/share/vim/vim74/ftplugin/perl.vim

Knowing where that was set I could then override it with an ‘after’ configuration file.

" ~/.vim/after/ftplugin/perl.vim
set iskeyword-=:

Then when I check in vim,

:verb set isk?
Last set from ~/.vim/after/ftplugin/perl.vim

For more of a discussion of the language specific configuration and the after files see this vim wiki entry – http://vim.wikia.com/wiki/Keep_your_vimrc_file_clean

LXC failed to bring up eth0

I came across an issue today where new lxc containers on a machine were failing to get an IP address. In fact it was worse than that, they didn’t appear to have an interface for networking. Just the loopback adaptor.

root@lxc-container:~# ifup eth0
Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Cannot find device "eth0"
Bind socket to interface: No such device
Failed to bring up eth0.

On the host machine brctl didn’t show any interfaces on the bridge.

# brctl show
bridge name	bridge id		STP enabled	interfaces
lxcbr0		8000.56847afe4a99	no		

It turns out the answer was in the machines config file. Usually located in /var/lib/lxc/$machine_name/config.

Normally these contain a section like this defining the network interfaces. In this case it was entirely missing from the configuration file.

lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = lxcbr0
lxc.network.hwaddr = 00:16:3e:99:99:99

Manually adding those settings to the configuration file setup a working eth0 on the machine when it was restarted.

In this case the reason for the missing section was a dodgy setting in the lxc.default_config setting that caused the machine setup to choke.

# cat `lxc-config lxc.default_config`
lxc.lxcpath = /lxc
lxc.network.type = veth
lxc.network.link = lxcbr0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx

In this case we had a rogue parameter, lxc.lxcpath that shouldn’t have been in the default_config. There wasn’t any visible error running lxc-create however so we didn’t spot it at first. It was only when we added -x to the shebang line in /usr/share/lxc/templates/lxc-debian that we saw an error,

root@host-machine:~# lxc-create -n nettest3 -t debian
lxc_container: unknown key lxc.lxcpath
lxc_container: Failed to parse config: lxc.lxcpath = /lxc

Once we removed the rogue lxc.lxcpath configuration setting new boxes were created as usual again.

Developing against the bleeding edge – Odoo 8.0 (before release)

I’ve been working on a project using Odoo 8.0 which at the time of writing isn’t fully released yet (the Odoo).  These are some notes on working at the bleeding edge as the release stabilises.

Working with a git version that is changing frequently means you need to upgrade your database on a fairly regular basis.  To do that the -u all command to openerp is very handy.

./openerp-server -d mydatabase -u all --stop-after-init --config=/etc/openerp/openerp-server.conf

The one feature we found caught us out initially was with the new reports.  The QWeb and bootstrap 3 were fine, it was the caching that surprised us.  As developers we tend to be trying lots of incremental changes on reports and at first it looked like our changes weren’t working.  Luckily we realised it was actually a new feature, caching.  This can be turned off against the reports assuming you have the access to the technical features.  It’s not on against every report, but invoices are one area it is used.  Partly because the reports are intended to be legal documents that shouldn’t change once they have been printed.

At the moment these are the open bugs on github I’m tracking.  If you need to apply one of the patches to get on with things just stick ‘.patch’ onto the github url and you will have a patch file you can apply in the usual fashion, i.e. patch -p1 < nnnn.patch

Some of these are caused by constant upgrading on the bleeding edge, others are things I hope get fixed before the final cut.  Note that these pull requests may not be the best way to fix the issues, but they do generally fix the symptoms so I have found them useful for getting on with my own development.

Issue 1110

Error context:
View `product.template.form`
[view_id: 259, xml_id: product.product_template_form_view_variant_button, model: product.template, parent_id: 252]


Issue 1267 – Fixed

KeyError: 'section_id'


Issue 1187

This one is just a tweak to the writeability of a few fields.


Issue 1284 – Fixed

XML RPC and None values


DBIx::Class Hackathon

I attended the first DBIx::Hackathon this weekend and I want to thank Jess for organising it, and Shadowcat who put a lot of time and effort into the event. Kudos to the sponsors Eligo and CTRL-O who provided backing.

I spent the day trying to improve the ‘where’ feature. It was my first patch to the core DBIx::Class module. Previously, I’ve reported little bugs to the project. Ribasushi, as ever was a great help with explaining what was needed in terms of fixing the issue. In fact he knows the code base so well he practically spoon fed me the solution as it’s just a minor tweak in his mind 😉

I suspect however this patch will never see the light of day. In some ways, I wish I had done more preparation for the hackathon so that I could have discussed it more intelligently in person. However, I’m glad I implemented the patch, because now it makes it easy to prove whether the feature is a good idea or not. One of the nice things about a hackathon is that you can free up a chunk of (relatively) guaranteed time for a project, something I find a lot harder to do in general life.

To try to demonstrate and test the various types of relationship I created a simple project on github. I’ve used DBIx::Class::Candy which is a small sugar layer on top of DBIx::Class so I hope you’ll forgive the slightly simpler syntax for my code. It’s a very trivial contrived 2 table database at the moment that allows me to play with some of the relationships.

My suspicion since I encountered the where feature is that it’s unfinished* and that extending it in a meaningful way will be tricky without causing lots of surprises to existing code that uses it.

The where attribute is described as useful for filtering relationships. As you’d imagine when you use it it sticks a where clause onto the search you do when following the relationship.

package …::Order;
has_many delivery_charge => '...::OrderItem', 'order_id',
            { where => { type => 'delivery' } };

In my example I have a relationship that allows me to pick out the delivery charges from the order as they are line items of the type ‘delivery’. Assuming you have fetched an order from the database and you follow the ‘delivery_charge’ accessor you’ll get an OrderItem resultset filtered to the type ‘delivery’.

$order->delivery_charge; # resultset containing order items of type ‘delivery’
# SELECT me.id, me.order_id, me.name, me.type, me.quantity, me.price 
#   FROM order_items me 
# WHERE me.order_id = '1' AND type = 'delivery'

That’s all pretty obvious really.

The problem really becomes obvious when you do a prefetch on the relationship.

my $order = $schema->resultset(‘Order’)->search({ id => 1 }, { 
prefetch => ['delivery_charge']
$order->id; # 1
my @delivery_charges = $order->delivery_charge->all; # actually returns all order lines, where clause not applied.

SELECT me.id, me.name, me.active, delivery_charge.id, delivery_charge.order_id, delivery_charge.name, delivery_charge.type, delivery_charge.quantity, delivery_charge.price 
  FROM orders me 
  LEFT JOIN order_items delivery_charge 
    ON delivery_charge.order_id = me.id 

The logical fix is to apply the where clause when joins happen. This is what the patch I created does.

Only where clause restrictions and joins are tricky beasts. There are two fundamental problems.

  1. Conceptually you probably really want the where clause to become a clause in a left outer join. I’ll explain more in a moment.
  2. You need to worry about table aliases. Since you’re now dealing with multiple tables you need to worry about ambiguity.

The first problem is the more serious and, in my opinion, will kill the current patch, and possibly any realistic attempt to fixing this feature.

Lets cover the first point now. The prefetch is a good example of how we want to get a list of orders, and we want the delivery charge items loaded at the same time so that we don’t have to make subsequent database calls. With the where clause being generated a query looks something like this,

SELECT me.id, me.name, me.active, delivery_charge.id, delivery_charge.order_id, delivery_charge.name, delivery_charge.type, delivery_charge.quantity, delivery_charge.price 
  FROM orders me 
  LEFT JOIN order_items delivery_charge 
    ON delivery_charge.order_id = me.id 
WHERE type = 'delivery'

Now consider a query for orders placed for today. When we make that query we expect to see all the orders placed today. If however we have some orders with delivery charges, and some that have none because they were collected in store, guess what happens when that where clause kicks in. It ends up filtering out the orders without any delivery charges. This is an SQL 101 type of problem that you generally solve with a left join and the clause promoted into the join condition. In fact the query generated already half does that, it does just that with the id link, so that the missing items shouldn’t prevent the main thing from being found. It’s just the where clause is in the wrong place. With the join ‘fixed’ you end up being surprised by your results when you join on one a relationship with a where clause.

The second problem will cause things to fail more visibly. Lets consider if we add a type field to our Order class now. A perfectly logical thing, we could add types like ‘store’, ‘internet’, ‘mail-order’. Then the query generated blows up again. You could hard code in a table alias into the where clause but that will fail when you have complex queries and you haven’t predicted the generated table alias correctly. DBIC doesn’t know what to do to automatically fix the reference and there isn’t any special syntax to do so.

I think the first problem is the more serious, and is the reason the feature should be left as it is. The second I think will ensure that it is because it’s likely to visibly blow up legacy code. It’s the legacy code that really makes fixing this awkward. As ribasushi made clear on the day, it would be fixable in a reasonably clear way by making the join condition a sub, but the existing code as it stands will fail if we add it to the join condition. It would be possible to add new semantics that enable newer style behaviour and fall back to the current behaviour when the old style declaration is used, but is it worth it? The where keyname seems like a good idea, except is it really? Assuming you don’t want problem 1 – the overly restricted search results – it won’t actually be a where clause in a join, so ‘where’ doesn’t make so much sense. If we wanted syntactic sugar to make this feature easy to use and just work we’d be better off coming up with a new key and semantics. The ‘where’ feature I would rather leave as a footnote in the documentation as something to avoid in newer code.

The sane way to achieve the same result these days is with the ‘Custom join conditions’ documented in DBIx::Class:Relationship::Base. They allow you to produce arbitrarily complex conditions. Because it takes a sub it becomes easy to deal with the ambiguity issue clearly and it also becomes possible to do far more complex things too.

has_many standard_products => '...::OrderItem',
            sub {         
                my $args = shift;
                return ( 
                        "$args->{self_alias}.id" => { -ident => "$args->{foreign_alias}.order_id" },
                        "$args->{foreign_alias}.type" => 'standard',                            
            }; # 'Custom join conditions' perldoc DBIx::Class:Relationship::Base

The downside of this compared to the where keyword is that it requires you to specify the whole condition, even the id join, rather than simply bolting on an extra condition. It’s great for allowing you to do anything, but the simple case is a little cumbersome.

See my previous blog post for an example of altering the join at runtime.

* apologies to the person whose baby I am calling ugly.

New Salt version break

I started getting these error messages last week and I discovered it was actually caused by an incompatibility between the new version 2014.01 and an old server 0.17. In general if you’re finding strange errors, particularly with configuration that used to work, double check your version numbers match. You can always do a ‘salt machine test.version’ to check the clients version.

salt machine state.highstate
	State: - no
	Name:  	states
	Function:  None
    	Result:	False
    	Comment:   No Top file or external nodes data matches found

Or via salt-call state.highstate

[ERROR   ] Got a bad pillar from master, type bool, expecting dict: False

LXC Networking

My infrequently used laptop appears to have developed some issues with the networking on new lxc containers (Ubuntu 13.10) so I finally started to dig into how it’s setup. I first blew away the caches in /var/cache/lxc, then tried installing a fresh box with a completely stock template. The network interface didn’t acquire one of the usual network ip addresses. A look in the logs suggested the dhcp server was receiving a DHCP request, and it was sending back a lease, but the client wasn’t picking that up.

tail /var/log/syslog
dnsmasq-dhcp[1261]: DHCPDISCOVER(lxcbr0) 00:22:3e:ea:aa:fa 
dnsmasq-dhcp[1261]: DHCPOFFER(lxcbr0) 00:22:3e:ea:aa:fa 
dnsmasq-dhcp[1261]: DHCPDISCOVER(lxcbr0) 00:22:3e:ea:aa:fa 
dnsmasq-dhcp[1261]: DHCPOFFER(lxcbr0) 00:22:3e:ea:aa:fa 
dnsmasq-dhcp[1261]: DHCPDISCOVER(lxcbr0) 00:22:3e:ea:aa:fa 
dnsmasq-dhcp[1261]: DHCPOFFER(lxcbr0) 00:22:3e:ea:aa:fa 

A google suggested my problem was the UDP checksums – bug 1204069. A look using tcpdump suggested that indeed my checksums were corrupt; note the ‘bad udp cksum’ on the second packet.

sudo tcpdump -vvv -i lxcbr0
21:36:34.009788 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328) > [udp sum ok] BOOTP/DHCP, Request from 00:22:3e:ea:aa:fa (oui Unknown), length 300, xid 0xb407cc75, secs 49, Flags [none] (0x0000)
	  Client-Ethernet-Address 00:22:3e:ea:aa:fa (oui Unknown)
	  Vendor-rfc1048 Extensions
	    DHCP-Message Option 53, length 1: Discover
	    Parameter-Request Option 55, length 13: 
	      Subnet-Mask, BR, Time-Zone, Default-Gateway
	      Domain-Name, Domain-Name-Server, Option 119, Hostname
	      Netbios-Name-Server, Netbios-Scope, MTU, Classless-Static-Route
	    END Option 255, length 0
	    PAD Option 0, length 0, occurs 41
21:36:34.010027 IP (tos 0xc0, ttl 64, id 44699, offset 0, flags [none], proto UDP (17), length 328) > [bad udp cksum 0x1c2d -> 0x1a4c!] BOOTP/DHCP, Reply, length 300, xid 0xb407cc75, secs 49, Flags [none] (0x0000)
	  Client-Ethernet-Address 00:22:3e:ea:aa:fa (oui Unknown)
	  Vendor-rfc1048 Extensions
	    DHCP-Message Option 53, length 1: Offer
	    Server-ID Option 54, length 4:
	    Lease-Time Option 51, length 4: 3600
	    RN Option 58, length 4: 1800
	    RB Option 59, length 4: 3150
	    Subnet-Mask Option 1, length 4:
	    BR Option 28, length 4:
	    Default-Gateway Option 3, length 4:
	    Domain-Name-Server Option 6, length 4:
	    END Option 255, length 0
	    PAD Option 0, length 0, occurs 8

The bug suggests that that should no longer be a problem, so I carried on googling. Turning my search criteria to DHCP udp problems. That came up with bug 930962 which gave a potential work around and suggested that it should be fixed. Since it said there should already be a firewall fix in the config I decided to take a look at the configuration. To figure out where that was I used dpkg -L

dpkg -L lxc

After a little digging I found the network configuration in /etc/init/lxc-net.conf where the setup of all the networking out of the box that I so like about ubuntu was all there. There was no mangle line as suggested in the bug report. At that point I checked the version I was running, and the version on the bug report and realised that was a pretty recent bug fix, and I simply don’t have it. Rather than figure out where the fixed package is, I figured I may as well just try to fix it myself for now since I’ve come so far.

sudo iptables -t mangle -A POSTROUTING -o lxcbr0 -p udp --dport bootpc -j CHECKSUM --checksum-fill

After that the networking came up, or more specifically the server obtained an ip address and I was able to ssh to it. I added pretty much that line into my /etc/init.lxc-net.conf and it now works out of the box again and I have a better understanding of how lxc sets up it’s network. I’ve also finally solved a case of checksum offload issues for myself rather than just hearing about them. Possibly in the strangest way, i.e. not turning off the network offload of the checksums.

Of course the adventure didn’t end there. I then found that salt wasn’t working. When I looked for salt keys I didn’t find any requests from new servers. A tcpdump showed a similar checksum issue with the salt traffic. A quick hack with iptables again solved that problem.

sudo iptables -t mangle -A POSTROUTING -o lxcbr0 -p udp --dport 4505 -j CHECKSUM --checksum-fill
sudo iptables -t mangle -A POSTROUTING -o lxcbr0 -p udp --sport 4505 -j CHECKSUM --checksum-fill
sudo iptables -t mangle -A POSTROUTING -o lxcbr0 -p udp --dport 4506 -j CHECKSUM --checksum-fill
sudo iptables -t mangle -A POSTROUTING -o lxcbr0 -p udp --sport 4506 -j CHECKSUM --checksum-fill

Testing out uninstall to fix a perl modules dependency issue

Just recently after an upgrade to some CPAN modules I started getting this crash on one of my machines when the Catalyst::View::JSON was loaded.

#     Error:  Couldn't instantiate component "TestApp::View::JSON", "Recursive inheritance detected in package 'Types::Serialiser::BooleanBase' at (eval 1547) line 76."Compilation failed in require at (eval 4) line 2.

The actual source of that error appears to be JSON::XS rather than Types::Serialiser::BooleanBase or TestApp::View::JSON.

I didn’t investigate the error properly, or really fix it properly. Instead I tested out one of the newer features of cpanm, the -U uninstall flag. I simply uninstalled JSON::XS and hey presto, no more crash.

cpanm -U JSON::XS
JSON::XS contains the following files:


Are you sure you want to uninstall JSON::XS? [y] y

That probably warrants some explanation. The new Catalyst::Runtime now appears to pull in the new alternative to JSON::XS, Cpanel::JSON::XS so this can now be used instead, and so things just worked. It’s probably a bit drastic a solution for most systems at the moment, I’m sure it will demonstrate any places where I have direct dependencies on JSON::XS. On my development box that should be handy however. I’d rather be using a single library for that single purpose.

Getting started with Salt

I’ve finally started to use the salt stack in anger now and I thought I’d make some quick notes on what I thought I was slow to pick up on.

Specifying that a box needs to have x services installed is all done in the configuration. I understood that there are two aspects to salt, one is running things via the command line, the second is configuration management. I thought I’d be able to do something like ‘salt provision box-a service-b’. Instead you generally put the machine names and what they should have in top.sls in the /srv/salt directory along with all the other config.

Pillars appears to be intended to provide the site specific data generally. In general the salt .sls files are the config that can be used everywhere, sometimes templated, and the pillar data can be used in those templates to insert data specific to a site/machine. This means you should be able to use the same basic salt setup on multiple salt-masters, and simply change the data on them to generate machines with different users and setups.

Salt appears to be very opaque to start with. It took me too long to realise that like most things, it logs. In fact it logs very well so you can really turn the log level up to a very high level if you are troubleshooting. It can be interesting to turn it up if you want to see what it’s doing in real time, otherwise you just see problems logged.

It’s well worth taking a look at the state documentation as you’re looking at the tutorials. I found the examples in the tutorial relatively hard to pick apart until I was able to see the references for the various states.


The file state is particularly worth taking a look at. A lot of deployment is pushing files about and modifying them.


Make sure you run the latest versions from salt’s official repositories if possible. The ones packaged with your OS are likely to be old. And don’t run inconsistent versions between your masters and minions. That way leads to fail.

If something obvious doesn’t appear to work, try it with a different package. I found my simple couchdb installation didn’t work quite right out of the box. I tried exactly the same thing with memcached and it did. This turned out to be a bug.

I’m still early on my journey into salt, but so far it appears to be very useful.