Hand Coding SQL with psycopg2 for Odoo

I thought it would be a good idea to write down some guidelines on writing hand coded SQL for Odoo/OpenERP. One of the general themes should be ‘avoid SQL injection’, and Odoo themselves provide some documentation explaining that. They also make the point before that you should try to stick with the ORM as much as possible too. Rolling your own SQL bypasses all the access controls on records, so if you are writing some SQL, always consider whether the user is supposed to have access to the data extracted or manipulated by the SQL.

Assuming you’re still set on writing SQL it’s worth noting that Odoo uses psycopg2, a mature and respectable library for accessing Postgres from Python. Dealing with parameters in SQL clauses is simple as it allows you to pass them through as parameters, just as you’d hope.

cr.execute('select * from res_partner where name = %s', (partner_name,))

Unfortunately it’s too simplistic to say never use string concatenation. In practice you often want to stitch together a complex query based on some data. You might want to vary the search criteria based on user input.

Here are some concrete scenarios to demonstrate more complex situations. Read the psycopg2 documentation for how to pass in values, and how it converts the various types first.

In clause

This is the most common place people give up with parameterisation. It’s clearly documented in Psycopg2, but since people often don’t read the docs, they don’t realise %s works and so they end up doing something involving join and their own format strings. The syntax is really simple,

cr.execute('select * from res_partner where name in %s', (['Partner 1', 'Partner 2'],))

Optional clauses

The trick with building up optional parts is to maintain lists of parameters alongside lists of the bits of SQL you need. The exact mechanism for appending the strings isn’t terribly important, and in fact you might find it useful to write yourself a nice class to make all that string construction simple and reliable. The key thing is to never mix untrusted data into the SQL you are generating.

clauses = []
params = []
if partner_name:
    clauses.append('name = %s')
    params.append(partner_name)
if website:
    clauses.append('website = %s')
    params.append(website)
where_clause = ""
if len(clauses) > 0:
    where_clause = "where " + ' and '.join(clauses)
   
sql = "select id, name from res_partner " + where_clause
cr.execute(sql, params)

Refactoring out chunks of SQL generation

If you find yourself requiring reuse with snippets of SQL or subqueries it’s quite common to refactor that out into a method. This is another common place for mistakes. Return SQL and a list of parameters whenever you do this, rather than just a block of SQL.

def stock_level(product_ids):
    sql = """
    select location_dest_id as location_id, product_uom, sum(product_qty) as qty
    from stock_move 
    where product_id IN %s
    and state = 'done'
    group by  location_dest_id, product_uom
    union all 
    select location_id, product_uom, 0 - sum(product_qty) as qty
    from stock_move 
    where product_id IN %s
    and state = 'done'
    group by  location_id, product_uom
    """
    return (sql, [product_ids, product_ids])

snippet, params = stock_level(product_ids)
sql = """
select location_id, product_uom, sum(qty) as stock_level
from (
    %s
) as stock
group by location_id, product_uom
having sum(qty) > 0 ;
""" % snippet
cr.execute(sql, params)

Using the same parameter twice

The previous example uses the same parameter twice in the query, and passes it in twice. This can be avoided by using a named arguments like %(name)s. This can also be useful when you have a very large query requiring lots of parameters as it allows you to pass in a dictionary and not worry about the ordering of the arguments you pass it.

def stock_level(product_ids):
    sql = """
    select location_dest_id as location_id, product_uom, sum(product_qty) as qty
    from stock_move 
    where product_id IN %(product_id)s
    and state = 'done'
    group by  location_dest_id, product_uom
    union all 
    select location_id, product_uom, 0 - sum(product_qty) as qty
    from stock_move 
    where product_id IN %(product_id)s
    and state = 'done'
    group by  location_id, product_uom
    """
    return (sql, {"product_id": product_ids})

snippet, params = stock_level(product_ids)
sql = """
select location_id, product_uom, sum(qty) as stock_level
from (
    %s
) as stock
group by location_id, product_uom
having sum(qty) > 0 ;
""" % snippet
cr.execute(sql, params)

Fixing badly refactored code

If you do find yourself with a function that is returning SQL only, and you can’t change the function prototype because external code is depending on it, psycopg2 provides the mogrify function that can help. It does require the cursor to be in scope.

def stock_level(cr, uid, product_ids):
    sql = """
    select location_dest_id as location_id, product_uom, sum(product_qty) as qty
    from stock_move 
    where product_id IN %s
    and state = 'done'
    group by  location_dest_id, product_uom
    union all 
    select location_id, product_uom, 0 - sum(product_qty) as qty
    from stock_move 
    where product_id IN %s
    and state = 'done'
    group by  location_id, product_uom
    """
    return cr.mogrify(sql, [product_ids, product_ids])

snippet = stock_level(cr, uid, product_ids)
sql = """
select location_id, product_uom, sum(qty) as stock_level
from (
    %s
) as stock
group by location_id, product_uom
having sum(qty) > 0 ;
""" % snippet
cr.execute(sql)

Allowing the caller to specify fields/tables

This is something where the libraries generally don’t help. You need to sanitize this code yourself. In general try to avoid fully untrusted output. It’s better to have a whitelist of what you expect to allow, and allow them to select from those known good tables/fields. Also remember that you can query the database for what the structure of the database is if you don’t know it when writing the code. If you really can’t check either, sanitize the input heavily. Try to allow the absolute minimum characters within identifiers and also consider using an extra qualifier so that you can identify programmatically created fields/tables and prevent name clashes with core database functions.

In summary

Keep your string manipulation safe, never append untrusted input (note that data you retrieved from the database should generally be treated as untrusted). Assuming you are doing it right, any query without parameters should look start to look suspicious. There are of course simple queries that take no parameters, but they should be easy to verify as safe.

Note that the code is not from a live system and may not actually work (it was typed up for this post and not tested). It’s designed to illustrate the principles rather than be blindly copy pasted.

DBIx::Class join pruning

DBIx::Class has a join optimiser to prune out any unnecessary joins from the queries it generates.  Occasionally this can cause code that looks good to fail unexpectedly.  The most obvious time this normally happens when using literal SQL.

The most common example of this is when doing an aggregate operation with a group by.  Here I’m trying to sum and count some values on a joined table.  This looks like pretty standard DBIC code, but it fails.  I don’t use the linked table in the group by, just in the columns where it’s my own raw SQL.

sub total
{
    my $self = shift;
    my @columns = $self->result_source->columns;
    my $me = $self->current_source_alias;
    return $self->search(undef, {
        join => ['items'],
        '+columns' => [
            { total => \'sum(amount) as total' },
            { lines => \'count(amount) as lines' },
        ],
        group_by => [map {"$me.$_"} @columns],
    });
}

Order->totals->as_subselect_rs->search({
        total => { '>' => 10 }
    },
    {
        '+columns' => ['total', 'lines']
    }
)->all;

 

DBI Exception: DBD::Pg::st execute failed: ERROR:  column "amount" does not exist
LINE 1: ... (SELECT "me"."id", "me"."name", "me"."type", sum(amount) as...
                                                             ^ [for Statement "SELECT "me"."id", "me"."name", "me"."type", "me"."total", "me"."lines" FROM (SELECT "me"."id", "me"."name", "me"."type", sum(amount) as total, count(amount) as lines FROM "orders" "me" GROUP BY "me"."id", "me"."name", "me"."type") "me" WHERE ( "total" > ? )" with ParamValues: 1='10']

Looking closely at the SQL generated the items alias isn’t joined in.  The optimiser has pruned that join because it couldn’t see any reference to the column.  It turns out that DBIC does really try to do the right thing, it’s just we didn’t give it enough info to give it a hint.

Adding the ‘items.’ qualifier to the column names in the SQL allows DBIC to realise that that join is necessary.

   return $self->search(undef, {
        join => ['items'],
        '+columns' => [
            # NOTE: fully qualified name
            # even though they are not ambiguous.
            { total => \'sum(items.amount) as total' },
            { lines => \'count(items.amount) as lines' },
        ],
        group_by => [map {"$me.$_"} @columns],
    });

With that table alias, not strictly necessary for the SQL since those column names are unambiguous, DBIC realises that it is needs the join.

SELECT "me"."id", "me"."name", "me"."type", "me"."total", "me"."lines"
  FROM (
    SELECT "me"."id", "me"."name", "me"."type", sum( items.amount ) AS total, count( items.amount ) AS lines
      FROM "orders" "me"
      LEFT JOIN "order_items" "items"
        ON "items"."order_id" = "me"."id"
    GROUP BY "me"."id", "me"."name", "me"."type"
   ) "me"
WHERE "total" > '10'

Querying debian/ubuntu packages

The debian package management system is something I’ve been using for years, and it’s served me well for all those years. For the first decade I rarely needed to use more than the common apt-get install or dpkg -i package.deb. Listing the installed packages with dpkg -l has been useful too, but apart from apt-cache search those were all the commands I needed most of the time.

Over the past 5 years I’ve been using slightly more complex queries to figure out information about packages I’ve installed. Working at a company where we are deploying and supporting debian servers on a regular basis has meant it’s been useful to poke about the .deb infrastructure a litle more. In truth most of the time I just use 2 commands, but they seem to get me through most of what I want to do.

If I don’t know which package a program comes from I can use dpkg-query with the -S flag.

$ dpkg-query -S /sbin/ss
iproute2: /bin/ss

To figure out what is provided by a package I can use the -L flag.

$ dpkg-query -L iproute2 | grep etc
/etc
/etc/iproute2
/etc/iproute2/rt_protos
/etc/iproute2/rt_tables
/etc/iproute2/ematch_map
/etc/iproute2/rt_realms
/etc/iproute2/rt_scopes
/etc/iproute2/rt_dsfield
/etc/iproute2/group
/usr/share/man/man8/ip-netconf.8.gz

That’s also useful for checking a package is still fully installed,

for f in `dpkg-query -L python-lxml`; do if [ ! -e $f ]; then echo Missing $f; fi; done

Of course there are other commands that I use periodically, but these are the ones I’ve been using frequently enough to start remembering by heart.

Vim word breaks and perl

I use vim for coding my Perl and in the most recent versions of Ubuntu I’ve not been keen on one of the changes to the syntax defaults.  The word definition has start to include : which means that class names like Blah::Thing count as one word which isn’t how I’m used to working.  Luckily vim is very configurable and the vim irc channel is also really helpful so I was able to find where the configuration was and how to override it quickly.

Finding out that it’s related to the iskeyword configuration was pretty simple, but changing that didn’t have any effect.  It turns out it is set in a Perl syntax specific configuration so overriding it is a little more tricky.

Figuring out where a configuration setting was set is done using :verb set ?

:verb set isk?
iskeyword=@,48-57,_,192-255,:
Last set from /usr/share/vim/vim74/ftplugin/perl.vim

Knowing where that was set I could then override it with an ‘after’ configuration file.

" ~/.vim/after/ftplugin/perl.vim
set iskeyword-=:

Then when I check in vim,

:verb set isk?
iskeyword=@,48-57,_,192-255
Last set from ~/.vim/after/ftplugin/perl.vim

For more of a discussion of the language specific configuration and the after files see this vim wiki entry – http://vim.wikia.com/wiki/Keep_your_vimrc_file_clean

LXC failed to bring up eth0

I came across an issue today where new lxc containers on a machine were failing to get an IP address. In fact it was worse than that, they didn’t appear to have an interface for networking. Just the loopback adaptor.

root@lxc-container:~# ifup eth0
Internet Systems Consortium DHCP Client 4.2.2
Copyright 2004-2011 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Cannot find device "eth0"
Bind socket to interface: No such device
Failed to bring up eth0.

On the host machine brctl didn’t show any interfaces on the bridge.

# brctl show
bridge name	bridge id		STP enabled	interfaces
lxcbr0		8000.56847afe4a99	no		

It turns out the answer was in the machines config file. Usually located in /var/lib/lxc/$machine_name/config.

Normally these contain a section like this defining the network interfaces. In this case it was entirely missing from the configuration file.

lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = lxcbr0
lxc.network.hwaddr = 00:16:3e:99:99:99

Manually adding those settings to the configuration file setup a working eth0 on the machine when it was restarted.

In this case the reason for the missing section was a dodgy setting in the lxc.default_config setting that caused the machine setup to choke.

# cat `lxc-config lxc.default_config`
lxc.lxcpath = /lxc
lxc.network.type = veth
lxc.network.link = lxcbr0
lxc.network.flags = up
lxc.network.hwaddr = 00:16:3e:xx:xx:xx

In this case we had a rogue parameter, lxc.lxcpath that shouldn’t have been in the default_config. There wasn’t any visible error running lxc-create however so we didn’t spot it at first. It was only when we added -x to the shebang line in /usr/share/lxc/templates/lxc-debian that we saw an error,

root@host-machine:~# lxc-create -n nettest3 -t debian
lxc_container: unknown key lxc.lxcpath
lxc_container: Failed to parse config: lxc.lxcpath = /lxc
…

Once we removed the rogue lxc.lxcpath configuration setting new boxes were created as usual again.

Developing against the bleeding edge – Odoo 8.0 (before release)

I’ve been working on a project using Odoo 8.0 which at the time of writing isn’t fully released yet (the Odoo).  These are some notes on working at the bleeding edge as the release stabilises.

Working with a git version that is changing frequently means you need to upgrade your database on a fairly regular basis.  To do that the -u all command to openerp is very handy.

./openerp-server -d mydatabase -u all --stop-after-init --config=/etc/openerp/openerp-server.conf

The one feature we found caught us out initially was with the new reports.  The QWeb and bootstrap 3 were fine, it was the caching that surprised us.  As developers we tend to be trying lots of incremental changes on reports and at first it looked like our changes weren’t working.  Luckily we realised it was actually a new feature, caching.  This can be turned off against the reports assuming you have the access to the technical features.  It’s not on against every report, but invoices are one area it is used.  Partly because the reports are intended to be legal documents that shouldn’t change once they have been printed.

At the moment these are the open bugs on github I’m tracking.  If you need to apply one of the patches to get on with things just stick ‘.patch’ onto the github url and you will have a patch file you can apply in the usual fashion, i.e. patch -p1 < nnnn.patch

Some of these are caused by constant upgrading on the bleeding edge, others are things I hope get fixed before the final cut.  Note that these pull requests may not be the best way to fix the issues, but they do generally fix the symptoms so I have found them useful for getting on with my own development.

Issue 1110

Error context:
View `product.template.form`
[view_id: 259, xml_id: product.product_template_form_view_variant_button, model: product.template, parent_id: 252]

https://github.com/odoo/odoo/issues/1110

Issue 1267 – Fixed

KeyError: 'section_id'

https://github.com/odoo/odoo/pull/1267

Issue 1187

This one is just a tweak to the writeability of a few fields.

https://github.com/odoo/odoo/pull/1187

Issue 1284 – Fixed

XML RPC and None values

https://github.com/odoo/odoo/pull/1284

DBIx::Class Hackathon

I attended the first DBIx::Hackathon this weekend and I want to thank Jess for organising it, and Shadowcat who put a lot of time and effort into the event. Kudos to the sponsors Eligo and CTRL-O who provided backing.

I spent the day trying to improve the ‘where’ feature. It was my first patch to the core DBIx::Class module. Previously, I’ve reported little bugs to the project. Ribasushi, as ever was a great help with explaining what was needed in terms of fixing the issue. In fact he knows the code base so well he practically spoon fed me the solution as it’s just a minor tweak in his mind ;)

I suspect however this patch will never see the light of day. In some ways, I wish I had done more preparation for the hackathon so that I could have discussed it more intelligently in person. However, I’m glad I implemented the patch, because now it makes it easy to prove whether the feature is a good idea or not. One of the nice things about a hackathon is that you can free up a chunk of (relatively) guaranteed time for a project, something I find a lot harder to do in general life.

To try to demonstrate and test the various types of relationship I created a simple project on github. I’ve used DBIx::Class::Candy which is a small sugar layer on top of DBIx::Class so I hope you’ll forgive the slightly simpler syntax for my code. It’s a very trivial contrived 2 table database at the moment that allows me to play with some of the relationships.

My suspicion since I encountered the where feature is that it’s unfinished* and that extending it in a meaningful way will be tricky without causing lots of surprises to existing code that uses it.

The where attribute is described as useful for filtering relationships. As you’d imagine when you use it it sticks a where clause onto the search you do when following the relationship.

package …::Order;
...
has_many delivery_charge => '...::OrderItem', 'order_id',
            { where => { type => 'delivery' } };

In my example I have a relationship that allows me to pick out the delivery charges from the order as they are line items of the type ‘delivery’. Assuming you have fetched an order from the database and you follow the ‘delivery_charge’ accessor you’ll get an OrderItem resultset filtered to the type ‘delivery’.

$order->delivery_charge; # resultset containing order items of type ‘delivery’
# SELECT me.id, me.order_id, me.name, me.type, me.quantity, me.price 
#   FROM order_items me 
# WHERE me.order_id = '1' AND type = 'delivery'

That’s all pretty obvious really.

The problem really becomes obvious when you do a prefetch on the relationship.

my $order = $schema->resultset(‘Order’)->search({ id => 1 }, { 
prefetch => ['delivery_charge']
})->first;
$order->id; # 1
my @delivery_charges = $order->delivery_charge->all; # actually returns all order lines, where clause not applied.

SELECT me.id, me.name, me.active, delivery_charge.id, delivery_charge.order_id, delivery_charge.name, delivery_charge.type, delivery_charge.quantity, delivery_charge.price 
  FROM orders me 
  LEFT JOIN order_items delivery_charge 
    ON delivery_charge.order_id = me.id 

The logical fix is to apply the where clause when joins happen. This is what the patch I created does.

Only where clause restrictions and joins are tricky beasts. There are two fundamental problems.

  1. Conceptually you probably really want the where clause to become a clause in a left outer join. I’ll explain more in a moment.
  2. You need to worry about table aliases. Since you’re now dealing with multiple tables you need to worry about ambiguity.

The first problem is the more serious and, in my opinion, will kill the current patch, and possibly any realistic attempt to fixing this feature.

Lets cover the first point now. The prefetch is a good example of how we want to get a list of orders, and we want the delivery charge items loaded at the same time so that we don’t have to make subsequent database calls. With the where clause being generated a query looks something like this,

SELECT me.id, me.name, me.active, delivery_charge.id, delivery_charge.order_id, delivery_charge.name, delivery_charge.type, delivery_charge.quantity, delivery_charge.price 
  FROM orders me 
  LEFT JOIN order_items delivery_charge 
    ON delivery_charge.order_id = me.id 
WHERE type = 'delivery'

Now consider a query for orders placed for today. When we make that query we expect to see all the orders placed today. If however we have some orders with delivery charges, and some that have none because they were collected in store, guess what happens when that where clause kicks in. It ends up filtering out the orders without any delivery charges. This is an SQL 101 type of problem that you generally solve with a left join and the clause promoted into the join condition. In fact the query generated already half does that, it does just that with the id link, so that the missing items shouldn’t prevent the main thing from being found. It’s just the where clause is in the wrong place. With the join ‘fixed’ you end up being surprised by your results when you join on one a relationship with a where clause.

The second problem will cause things to fail more visibly. Lets consider if we add a type field to our Order class now. A perfectly logical thing, we could add types like ‘store’, ‘internet’, ‘mail-order’. Then the query generated blows up again. You could hard code in a table alias into the where clause but that will fail when you have complex queries and you haven’t predicted the generated table alias correctly. DBIC doesn’t know what to do to automatically fix the reference and there isn’t any special syntax to do so.

I think the first problem is the more serious, and is the reason the feature should be left as it is. The second I think will ensure that it is because it’s likely to visibly blow up legacy code. It’s the legacy code that really makes fixing this awkward. As ribasushi made clear on the day, it would be fixable in a reasonably clear way by making the join condition a sub, but the existing code as it stands will fail if we add it to the join condition. It would be possible to add new semantics that enable newer style behaviour and fall back to the current behaviour when the old style declaration is used, but is it worth it? The where keyname seems like a good idea, except is it really? Assuming you don’t want problem 1 – the overly restricted search results – it won’t actually be a where clause in a join, so ‘where’ doesn’t make so much sense. If we wanted syntactic sugar to make this feature easy to use and just work we’d be better off coming up with a new key and semantics. The ‘where’ feature I would rather leave as a footnote in the documentation as something to avoid in newer code.

The sane way to achieve the same result these days is with the ‘Custom join conditions’ documented in DBIx::Class:Relationship::Base. They allow you to produce arbitrarily complex conditions. Because it takes a sub it becomes easy to deal with the ambiguity issue clearly and it also becomes possible to do far more complex things too.

has_many standard_products => '...::OrderItem',
            sub {         
                my $args = shift;
                return ( 
                    {    
                        "$args->{self_alias}.id" => { -ident => "$args->{foreign_alias}.order_id" },
                        "$args->{foreign_alias}.type" => 'standard',                            
                    }                                                                           
                );                      
            }; # 'Custom join conditions' perldoc DBIx::Class:Relationship::Base

The downside of this compared to the where keyword is that it requires you to specify the whole condition, even the id join, rather than simply bolting on an extra condition. It’s great for allowing you to do anything, but the simple case is a little cumbersome.

See my previous blog post for an example of altering the join at runtime.

* apologies to the person whose baby I am calling ugly.

New Salt version break

I started getting these error messages last week and I discovered it was actually caused by an incompatibility between the new version 2014.01 and an old server 0.17. In general if you’re finding strange errors, particularly with configuration that used to work, double check your version numbers match. You can always do a ‘salt machine test.version’ to check the clients version.

salt machine state.highstate
machine:
----------
	State: - no
	Name:  	states
	Function:  None
    	Result:	False
    	Comment:   No Top file or external nodes data matches found
    	Changes:

Or via salt-call state.highstate

[ERROR   ] Got a bad pillar from master, type bool, expecting dict: False

LXC Networking

My infrequently used laptop appears to have developed some issues with the networking on new lxc containers (Ubuntu 13.10) so I finally started to dig into how it’s setup. I first blew away the caches in /var/cache/lxc, then tried installing a fresh box with a completely stock template. The network interface didn’t acquire one of the usual 10.0.3.0 network ip addresses. A look in the logs suggested the dhcp server was receiving a DHCP request, and it was sending back a lease, but the client wasn’t picking that up.

tail /var/log/syslog
dnsmasq-dhcp[1261]: DHCPDISCOVER(lxcbr0) 00:22:3e:ea:aa:fa 
dnsmasq-dhcp[1261]: DHCPOFFER(lxcbr0) 10.0.3.231 00:22:3e:ea:aa:fa 
dnsmasq-dhcp[1261]: DHCPDISCOVER(lxcbr0) 00:22:3e:ea:aa:fa 
dnsmasq-dhcp[1261]: DHCPOFFER(lxcbr0) 10.0.3.231 00:22:3e:ea:aa:fa 
dnsmasq-dhcp[1261]: DHCPDISCOVER(lxcbr0) 00:22:3e:ea:aa:fa 
dnsmasq-dhcp[1261]: DHCPOFFER(lxcbr0) 10.0.3.231 00:22:3e:ea:aa:fa 

A google suggested my problem was the UDP checksums – bug 1204069. A look using tcpdump suggested that indeed my checksums were corrupt; note the ‘bad udp cksum’ on the second packet.

sudo tcpdump -vvv -i lxcbr0
21:36:34.009788 IP (tos 0x10, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 328)
    0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from 00:22:3e:ea:aa:fa (oui Unknown), length 300, xid 0xb407cc75, secs 49, Flags [none] (0x0000)
	  Client-Ethernet-Address 00:22:3e:ea:aa:fa (oui Unknown)
	  Vendor-rfc1048 Extensions
	    DHCP-Message Option 53, length 1: Discover
	    Parameter-Request Option 55, length 13: 
	      Subnet-Mask, BR, Time-Zone, Default-Gateway
	      Domain-Name, Domain-Name-Server, Option 119, Hostname
	      Netbios-Name-Server, Netbios-Scope, MTU, Classless-Static-Route
	      NTP
	    END Option 255, length 0
	    PAD Option 0, length 0, occurs 41
21:36:34.010027 IP (tos 0xc0, ttl 64, id 44699, offset 0, flags [none], proto UDP (17), length 328)
    10.0.3.1.bootps > 10.0.3.231.bootpc: [bad udp cksum 0x1c2d -> 0x1a4c!] BOOTP/DHCP, Reply, length 300, xid 0xb407cc75, secs 49, Flags [none] (0x0000)
	  Your-IP 10.0.3.231
	  Server-IP 10.0.3.1
	  Client-Ethernet-Address 00:22:3e:ea:aa:fa (oui Unknown)
	  Vendor-rfc1048 Extensions
	    DHCP-Message Option 53, length 1: Offer
	    Server-ID Option 54, length 4: 10.0.3.1
	    Lease-Time Option 51, length 4: 3600
	    RN Option 58, length 4: 1800
	    RB Option 59, length 4: 3150
	    Subnet-Mask Option 1, length 4: 255.255.255.0
	    BR Option 28, length 4: 10.0.3.255
	    Default-Gateway Option 3, length 4: 10.0.3.1
	    Domain-Name-Server Option 6, length 4: 10.0.3.1
	    END Option 255, length 0
	    PAD Option 0, length 0, occurs 8

The bug suggests that that should no longer be a problem, so I carried on googling. Turning my search criteria to DHCP udp problems. That came up with bug 930962 which gave a potential work around and suggested that it should be fixed. Since it said there should already be a firewall fix in the config I decided to take a look at the configuration. To figure out where that was I used dpkg -L

dpkg -L lxc
…
/etc/init/lxc-net.conf
…

After a little digging I found the network configuration in /etc/init/lxc-net.conf where the setup of all the networking out of the box that I so like about ubuntu was all there. There was no mangle line as suggested in the bug report. At that point I checked the version I was running, and the version on the bug report and realised that was a pretty recent bug fix, and I simply don’t have it. Rather than figure out where the fixed package is, I figured I may as well just try to fix it myself for now since I’ve come so far.

sudo iptables -t mangle -A POSTROUTING -o lxcbr0 -p udp --dport bootpc -j CHECKSUM --checksum-fill

After that the networking came up, or more specifically the server obtained an ip address and I was able to ssh to it. I added pretty much that line into my /etc/init.lxc-net.conf and it now works out of the box again and I have a better understanding of how lxc sets up it’s network. I’ve also finally solved a case of checksum offload issues for myself rather than just hearing about them. Possibly in the strangest way, i.e. not turning off the network offload of the checksums.

Of course the adventure didn’t end there. I then found that salt wasn’t working. When I looked for salt keys I didn’t find any requests from new servers. A tcpdump showed a similar checksum issue with the salt traffic. A quick hack with iptables again solved that problem.

sudo iptables -t mangle -A POSTROUTING -o lxcbr0 -p udp --dport 4505 -j CHECKSUM --checksum-fill
sudo iptables -t mangle -A POSTROUTING -o lxcbr0 -p udp --sport 4505 -j CHECKSUM --checksum-fill
sudo iptables -t mangle -A POSTROUTING -o lxcbr0 -p udp --dport 4506 -j CHECKSUM --checksum-fill
sudo iptables -t mangle -A POSTROUTING -o lxcbr0 -p udp --sport 4506 -j CHECKSUM --checksum-fill

Testing out uninstall to fix a perl modules dependency issue

Just recently after an upgrade to some CPAN modules I started getting this crash on one of my machines when the Catalyst::View::JSON was loaded.

#     Error:  Couldn't instantiate component "TestApp::View::JSON", "Recursive inheritance detected in package 'Types::Serialiser::BooleanBase' at (eval 1547) line 76."Compilation failed in require at (eval 4) line 2.

The actual source of that error appears to be JSON::XS rather than Types::Serialiser::BooleanBase or TestApp::View::JSON.

I didn’t investigate the error properly, or really fix it properly. Instead I tested out one of the newer features of cpanm, the -U uninstall flag. I simply uninstalled JSON::XS and hey presto, no more crash.

cpanm -U JSON::XS
JSON::XS contains the following files:

  /home/colin/perl5/perlbrew/perls/perl-5.14.4/bin/json_xs
  /home/colin/perl5/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/x86_64-linux/JSON/XS.pm
  /home/colin/perl5/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/x86_64-linux/JSON/XS/Boolean.pm
  /home/colin/perl5/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/x86_64-linux/auto/JSON/XS/XS.bs
  /home/colin/perl5/perlbrew/perls/perl-5.14.4/lib/site_perl/5.14.4/x86_64-linux/auto/JSON/XS/XS.so
  /home/colin/perl5/perlbrew/perls/perl-5.14.4/man/man1/json_xs.1
  /home/colin/perl5/perlbrew/perls/perl-5.14.4/man/man3/JSON::XS.3
  /home/colin/perl5/perlbrew/perls/perl-5.14.4/man/man3/JSON::XS::Boolean.3

Are you sure you want to uninstall JSON::XS? [y] y

That probably warrants some explanation. The new Catalyst::Runtime now appears to pull in the new alternative to JSON::XS, Cpanel::JSON::XS so this can now be used instead, and so things just worked. It’s probably a bit drastic a solution for most systems at the moment, I’m sure it will demonstrate any places where I have direct dependencies on JSON::XS. On my development box that should be handy however. I’d rather be using a single library for that single purpose.