Debugging Web API traffic

Note that this blog post mostly assumes you’re operating on Linux. If you’re using Windows just use Fiddler. It probably does everything you need. Actually, having just looked at their page it looks like it may well work on lots of places other than Windows too, so it might be a good option.

When developing programs that consume API’s that make use of HTTP at some level it’s often useful to check what is actually going over the wire. Note that this is talking about unencrypted traffic at this level. If you’re talking to a server over HTTPS you will need to MITM your connection or get the keys for the session using the SSLKEYLOGFILE environment variable.

The simplest way to capture traffic is generally to use tcpdump. Wireshark is often a good tool for looking at network traces, but for lots of HTTP requests it tends to feel clunky. This is where I turn to a python utility named pcap2har. This converts a packet capture to a HAR file. A HAR file is essentially a json file containing the HTTP requests. It has an array of the requests with each request noting the headers and content of the parts of the request/response. The HAR file format is documented here.

You will actually find that Google Chrome allows you to export a set of requests from it’s Network tab of the Developer toolbar as a HAR file too.

The pcap2har utility isn’t packaged in a particularly pythonic way, and it doesn’t actually extract the request body so I created a minor tweak to it on a branch. This branch does extract the request body which is often used in API calls. You’ll need to pip install dpkt, the rest of the dependencies are bundled in the repo. Then you run it like this,

git clone https://github.com/OpusVL/pcap2har.git --branch request_body
cd pcap2har
sudo pip install dpkt
tcpdump port 8069 -w packets.dump
pcap2har packets.dump traffic.har

Having said that HAR is a lot easier to consume, there are viewers, but I’ve not found one that I particularly liked. I tend to either pretty print the json and look at it in a text editor, or then use grep or code to extract the information I want.

For OpenERP/Odoo API calls I created a quick script to explode the contents of the API calls out to make it easier to read. It explodes the xml/json contents within the requests/responses out to json at the same level as the rest of the HAR data, rather than having encoded data encoded within json.

Advertisements

Hand Coding SQL with psycopg2 for Odoo

I thought it would be a good idea to write down some guidelines on writing hand coded SQL for Odoo/OpenERP. One of the general themes should be ‘avoid SQL injection’, and Odoo themselves provide some documentation explaining that. They also make the point before that you should try to stick with the ORM as much as possible too. Rolling your own SQL bypasses all the access controls on records, so if you are writing some SQL, always consider whether the user is supposed to have access to the data extracted or manipulated by the SQL.

Assuming you’re still set on writing SQL it’s worth noting that Odoo uses psycopg2, a mature and respectable library for accessing Postgres from Python. Dealing with parameters in SQL clauses is simple as it allows you to pass them through as parameters, just as you’d hope.

cr.execute('select * from res_partner where name = %s', (partner_name,))

Unfortunately it’s too simplistic to say never use string concatenation. In practice you often want to stitch together a complex query based on some data. You might want to vary the search criteria based on user input.

Here are some concrete scenarios to demonstrate more complex situations. Read the psycopg2 documentation for how to pass in values, and how it converts the various types first.

In clause

This is the most common place people give up with parameterisation. It’s clearly documented in Psycopg2, but since people often don’t read the docs, they don’t realise %s works and so they end up doing something involving join and their own format strings. The syntax is really simple,

cr.execute('select * from res_partner where name in %s', (['Partner 1', 'Partner 2'],))

Optional clauses

The trick with building up optional parts is to maintain lists of parameters alongside lists of the bits of SQL you need. The exact mechanism for appending the strings isn’t terribly important, and in fact you might find it useful to write yourself a nice class to make all that string construction simple and reliable. The key thing is to never mix untrusted data into the SQL you are generating.

clauses = []
params = []
if partner_name:
    clauses.append('name = %s')
    params.append(partner_name)
if website:
    clauses.append('website = %s')
    params.append(website)
where_clause = ""
if len(clauses) > 0:
    where_clause = "where " + ' and '.join(clauses)
   
sql = "select id, name from res_partner " + where_clause
cr.execute(sql, params)

Refactoring out chunks of SQL generation

If you find yourself requiring reuse with snippets of SQL or subqueries it’s quite common to refactor that out into a method. This is another common place for mistakes. Return SQL and a list of parameters whenever you do this, rather than just a block of SQL.

def stock_level(product_ids):
    sql = """
    select location_dest_id as location_id, product_uom, sum(product_qty) as qty
    from stock_move 
    where product_id IN %s
    and state = 'done'
    group by  location_dest_id, product_uom
    union all 
    select location_id, product_uom, 0 - sum(product_qty) as qty
    from stock_move 
    where product_id IN %s
    and state = 'done'
    group by  location_id, product_uom
    """
    return (sql, [product_ids, product_ids])

snippet, params = stock_level(product_ids)
sql = """
select location_id, product_uom, sum(qty) as stock_level
from (
    %s
) as stock
group by location_id, product_uom
having sum(qty) > 0 ;
""" % snippet
cr.execute(sql, params)

Using the same parameter twice

The previous example uses the same parameter twice in the query, and passes it in twice. This can be avoided by using a named arguments like %(name)s. This can also be useful when you have a very large query requiring lots of parameters as it allows you to pass in a dictionary and not worry about the ordering of the arguments you pass it.

def stock_level(product_ids):
    sql = """
    select location_dest_id as location_id, product_uom, sum(product_qty) as qty
    from stock_move 
    where product_id IN %(product_id)s
    and state = 'done'
    group by  location_dest_id, product_uom
    union all 
    select location_id, product_uom, 0 - sum(product_qty) as qty
    from stock_move 
    where product_id IN %(product_id)s
    and state = 'done'
    group by  location_id, product_uom
    """
    return (sql, {"product_id": product_ids})

snippet, params = stock_level(product_ids)
sql = """
select location_id, product_uom, sum(qty) as stock_level
from (
    %s
) as stock
group by location_id, product_uom
having sum(qty) > 0 ;
""" % snippet
cr.execute(sql, params)

Fixing badly refactored code

If you do find yourself with a function that is returning SQL only, and you can’t change the function prototype because external code is depending on it, psycopg2 provides the mogrify function that can help. It does require the cursor to be in scope.

def stock_level(cr, uid, product_ids):
    sql = """
    select location_dest_id as location_id, product_uom, sum(product_qty) as qty
    from stock_move 
    where product_id IN %s
    and state = 'done'
    group by  location_dest_id, product_uom
    union all 
    select location_id, product_uom, 0 - sum(product_qty) as qty
    from stock_move 
    where product_id IN %s
    and state = 'done'
    group by  location_id, product_uom
    """
    return cr.mogrify(sql, [product_ids, product_ids])

snippet = stock_level(cr, uid, product_ids)
sql = """
select location_id, product_uom, sum(qty) as stock_level
from (
    %s
) as stock
group by location_id, product_uom
having sum(qty) > 0 ;
""" % snippet
cr.execute(sql)

Allowing the caller to specify fields/tables

This is something where the libraries generally don’t help. You need to sanitize this code yourself. In general try to avoid fully untrusted output. It’s better to have a whitelist of what you expect to allow, and allow them to select from those known good tables/fields. Also remember that you can query the database for what the structure of the database is if you don’t know it when writing the code. If you really can’t check either, sanitize the input heavily. Try to allow the absolute minimum characters within identifiers and also consider using an extra qualifier so that you can identify programmatically created fields/tables and prevent name clashes with core database functions.

In summary

Keep your string manipulation safe, never append untrusted input (note that data you retrieved from the database should generally be treated as untrusted). Assuming you are doing it right, any query without parameters should look start to look suspicious. There are of course simple queries that take no parameters, but they should be easy to verify as safe.

Note that the code is not from a live system and may not actually work (it was typed up for this post and not tested). It’s designed to illustrate the principles rather than be blindly copy pasted.

Developing against the bleeding edge – Odoo 8.0 (before release)

I’ve been working on a project using Odoo 8.0 which at the time of writing isn’t fully released yet (the Odoo).  These are some notes on working at the bleeding edge as the release stabilises.

Working with a git version that is changing frequently means you need to upgrade your database on a fairly regular basis.  To do that the -u all command to openerp is very handy.

./openerp-server -d mydatabase -u all --stop-after-init --config=/etc/openerp/openerp-server.conf

The one feature we found caught us out initially was with the new reports.  The QWeb and bootstrap 3 were fine, it was the caching that surprised us.  As developers we tend to be trying lots of incremental changes on reports and at first it looked like our changes weren’t working.  Luckily we realised it was actually a new feature, caching.  This can be turned off against the reports assuming you have the access to the technical features.  It’s not on against every report, but invoices are one area it is used.  Partly because the reports are intended to be legal documents that shouldn’t change once they have been printed.

At the moment these are the open bugs on github I’m tracking.  If you need to apply one of the patches to get on with things just stick ‘.patch’ onto the github url and you will have a patch file you can apply in the usual fashion, i.e. patch -p1 < nnnn.patch

Some of these are caused by constant upgrading on the bleeding edge, others are things I hope get fixed before the final cut.  Note that these pull requests may not be the best way to fix the issues, but they do generally fix the symptoms so I have found them useful for getting on with my own development.

Issue 1110

Error context:
View `product.template.form`
[view_id: 259, xml_id: product.product_template_form_view_variant_button, model: product.template, parent_id: 252]

https://github.com/odoo/odoo/issues/1110

Issue 1267 – Fixed

KeyError: 'section_id'

https://github.com/odoo/odoo/pull/1267

Issue 1187

This one is just a tweak to the writeability of a few fields.

https://github.com/odoo/odoo/pull/1187

Issue 1284 – Fixed

XML RPC and None values

https://github.com/odoo/odoo/pull/1284

Salt service running

If you’re using salt to ensure you’re running your services and you constantly see your service being started it might be because the service doesn’t support the status command.

----------
    State: - service
    Name:      openerp
    Function:  running
        Result:    True
        Comment:   Started Service openerp
        Changes:   openerp: True

If you tweak the salt minion to log trace messages you’ll see that salt calls service openerp status.

/etc/salt/minion:
log_level: trace

# service salt-minion restart

/var/log/salt/minion
2013-10-02 19:26:32,602 [salt.loaded.int.module.cmdmod][INFO    ] Executing command 'service openerp status' in directory '/root'
2013-10-02 19:26:32,611 [salt.loaded.int.module.cmdmod][INFO    ] Executing command 'service openerp start' in directory '/root'

If we try that on the command line we’ll see that the openerp service doesn’t support the status option.

root@openerp-aq:~# service openerp status
Usage: openerp-server {start|stop|restart|force-reload}

That means salt gets an error and assumes the service isn’t running so it starts it.

The service state has an alternative way to check for the service running by looking at ps and grepping for the process. This is done by specifying the sig key.

openerp:
  service:
    - running
    - sig: openerp
    - require:
      - pkg: openerp

Now we finally see what we want, salt leaving the service running.

----------
    State: - service
    Name:      openerp
    Function:  running
        Result:    True
        Comment:   The service openerp is already running
        Changes:   

In the log we now see,

2013-10-02 19:31:15,371 [salt.loaded.int.module.cmdmod][INFO    ] Executing command "ps -efH | grep 'openerp' | grep -v grep | awk '{print $2}'" in directory '/root'

A few tips on using the OpenERP XML RPC API

I’ve been developing applications that talk to OpenERP via it’s API for a little while now and I figured it would be worth noting down some general pointers for using it. The API is used by all the UI clients for OpenERP so it definitely allows you to do everything that they do. It’s also got a reasonable amount of access to the data layer so you can do an awful lot via it. From simply creating interfaces to it, to streamlining a process with a slick UI, there are a lot of possibilities with it.

Here are my tips for developing with it,

  1. Run OpenERP with the --log-request and --log-response command line flags to see how OpenERP achieves it’s tasks. It’s also helpful to see what your application is sending.
  2. Don’t be afraid to read the code. Watching OpenERP is a great starting point, but sometimes you need to look at the code to understand what a parameter is for, or how it’s getting the data.
  3. Don’t be afraid to use the debugger. Putting --debug on the command line allows the python debugger to kick in when there is an exception. It also allows you to stick ‘import pdb; pdb.set_trace()’ onto a line in the code you want to investigate and drop into there in the debugger.
  4. Pass the context like you are told to, it makes life easier if you are setup to simply pass it along with your calls (but also be ready to add data to it). The context contains things like the language which is used for translations. You want your applications language to be consistent with any users of OpenERP itself otherwise things will get weird. Also note that some calls require extra information to be passed in via it. It’s not a completely opaque blob that you simply pass about between calls.
  5. Limit the columns you ask for. When you make a read call you can specify which columns you are interested in. Do that from the start or you’ll end up with performance problems later on. Once you have a partner with 1000’s of orders and invoices etc. a simple read of the res.partner will take a significant chunk of time if you aren’t limiting what you read from it.
  6. Don’t be afraid to extend OpenERP. Even if you aren’t a Python developer by trade, if you’re doing serious data modification you’re better off creating a module and calling a method on that than making lots of API calls. The main reason for that is to get everything into a single transaction. The other reason is speed.
  7. Be careful about filtering by calculated fields, they generally get turned into a ‘TRUE’ statement in the SQL. This can really screw things up when you have OR conditions in your filter. Use the --log-sql if you’re unsure what is happening.
  8. False is used for Null/None in the XML RPC
  9. Returning a dictionary from an OpenERP method via the API requires the keys to be strings. i.e you can’t simply return { id: quantity }. You’d need to do { str(id): quantity } to prevent a crash. None is also a no-no for dictionary values. Convert them to False if you want to be consistent with OpenERP.
  10. Formatting of numbers to n decimal places is largely done client side. OpenERP provides you all the info you need, but you need to deal with that yourself.
  11. Beware of gapless ir.sequence sequences in batch jobs. They have a high potential for causing contention.

OpenERP debugging tip – turn off cron

While we’ve been doing a lot of OpenERP deployment we have been discovering various ways to configure it and one turns out to be very handy for debugging. If you’ve ever debugged OpenERP using the --debug flag and dropped into the debugger you have probably noticed the system carries on doing things while you’re sat at the debugger prompt. Often obliterating what you were looking at. This generally happens because OpenERP generally runs with multiple threads out of the box, and some of those threads do the ‘cron’ jobs, the background tasks, so even if you haven’t tried to do anything on the website, there will be activity. If you want to prevent the cron activity from making your debugging session more confusing than it needs to be add the --max-cron-threads=0 flag when you run OpenERP.

Hooking into the OpenERP ORM

I’ve been hooking into the OpenERP ORM layer of a few of the models to add full text search via an external engine. Thanks to the way OpenERP is structured that appears to be quite a reliable approach. As I was doing it I found that I wanted a common set of hooks on several models. Doing that suggested I should refactor the hooks to a mixin or a base class. After playing about with my OpenERP module I’ve come to the conclusion that creating a base class seems to be the most reliable way to hook the methods. The one trick you need to be aware of is the _register flag that you want to set to False for your base class.

class search_base(osv.osv):

    _register = False

    def write(self, cr, user, ids, vals, context=None):
       success = super(search_base, self).write(cr, user, ids,
                                                    vals, context)
       … do stuff
       return success


class product_template_search(search_base):
    _name = "product.template"
    _inherit = "product.template"
    _register = True


class product_search(search_base):
    _name = "product.product"
    _inherit = "product.product"
    _register = True

product_search()
product_template_search()

Without that you’ll end up with the orm whining that the search_base class has no _name attribute as it tries to register it as a model.

openerp.osv.orm: The class search_base has to have a _name attribute
openerp.netsvc: ValueError
The class search_base has to have a _name attribute
> /usr/lib/pymodules/python2.7/openerp/osv/orm.py(967)__init__()
-> raise except_orm('ValueError', msg)

Adding new reports to OpenERP

Creating new modules for Open ERP is pretty simple. Here is how to package up some new reports into a module.

In this example I’ve created another report to use with sales orders. This is based on the sales order report so in actual fact I use some of the code and xml from that original report to build it up, then tweak them to produce the collection docket I want. I’m not actually going to reproduce the report here, since that’s a trivial thing to customise. The interesting thing is really how the module is packaged together and how to spot mistakes in the packaging.

The files/folders in the zip

module/__openerp__.py                # this contains the module info
module/report/collection_docket.rml  # the report
module/report/sale_order.py          # sets up the report parser
module/report/__init__.py            # just loads the code
module/reports.xml                   # registers the report
module/__init__.py                   # loads the code

__openerp__.py

{
   'name': 'Extra Sales Reports',
   'version': '0.01',
   'category': 'Extra reports for sales',
   'description': """
   The extra sales reports needed for our project,

   * Reports
     - Collection Docket

   """,
   'author': 'OpusVL',
   'website': 'http://www.opusvl.com',
   'depends': ['stock', 'procurement', 'board', 'sale'],
   'init_xml': [],
   'update_xml': [
       'reports.xml',
   ],
   'demo_xml': [],
   'test': [],
   'installable': True,
   'active': False,
}

__init.py__

import report

reports.xml

<?xml version="1.0" encoding="utf-8"?>
<openerp>
    <data>
        <report auto="False" id="collection_docket" model="sale.order" name="sale.collection_docket" rml="module/report/collection_docket.rml" string="Collection Docket" />
    </data>
</openerp>

report/__init__.py

import sale_order

report/sale_order.py

from report import report_sxw
import time


# this bit is basically a copy of the stuff
# in the regular sale order module.
# I can't just import that code because it causes
# the sale.order report to get registered again
# causing it to complain.
# otherwise I’d do this - from addons.sale.report import order 
class order(report_sxw.rml_parse):
    def __init__(self, cr, uid, name, context=None):
        super(order, self).__init__(cr, uid, name, context=context)
        self.localcontext.update({
            'time': time,
        })


report_sxw.report_sxw('report.sale.collection_docket', 'sale.order', 'addons/module/report/collection_docket.rml', parser=order, header="external")

report/collection_docket.rml

<?xml version="1.0"?>
<document filename="Sale Order.pdf">
... this is a copy of the addons/sale/report/sale_order.rml customised as necessary.

The module is then zipped up for distribution in a regular .zip file. This can either be imported directly into OpenERP or it can be unzipped manually into the addons directory.

Installation of the module via the OpenERP client

  1. Go to Administration->Modules,
  2. Select Import module and select the zip [1].
  3. Select the module and mark it for install.
  4. Now restart OpenERP server.
  5. Now go back to the client and schedule the install of the module.

If you install your module this way you will actually find that the module is left in it’s zip file and the OpenERP server simply reads the files from the zip as if they were an extension of the addons directory.

Manual installation

  1. Find the addons directory
  2. Unzip the module into it.
  3. Restart the openerp server
  4. Go to Administration -> Modules
  5. Select ‘Update Module List’.
  6. Find the module and schedule it for install.

Use

The report can now be used programmatically using the standard report method and referencing it as sale.collection_docket. Alternatively you’ll find that a button has appeared on the sales order screen in the OpenERP client that allows you to print the collection docket alongside the button for printing the regular sales order report.

Trouble shooting

ZipImportError: bad local file header in /usr/share/pyshared/openerp-server/addons/myextra_reports.
zip

This normally indicates it is time to restart the server. If you have just imported the zip file of the module and tried to install it straight away you will often get this error.

ERROR:web-services:[01]: Exception: Report /usr/share/pyshared/openerp-server/addons/myextra_report/report/collection_docket.rml doesn’t exist or deleted :

This is generally caused by a typo in the parser python file where the report is registered with the report_sxw.report_sxw call.

ERROR:service:This service does not exist: ‘report.sale.collection_docket’
ERROR:web-services:[07]: KeyError: ‘report.sale.collection_docket’

The report hasn’t been registered. Has your module been installed and loaded against the current database? Remember that installing the module into the OpenERP server and installing it against your current database are two seperate steps.

ERROR:web-services:[01]: Exception: Start tag expected, ‘<' not found, line 1, column 1

This can be caused by a bad filename registered using the xml file. The path to the file should be in relation to the addons path. In other words, if the full path is /usr/share/pyshared/openerp-server/addons/module/report/docket.rml, the relative filename you need is module/report/docket.rml.
http://www.openerp.com/forum/topic26308.html

The alternative cause of this problem is that there is a unicode BOM indicator at the start of the file. One of the support entries appears to indicate that that will cause the parser to dislike the document.

https://bugs.launchpad.net/openobject-server/+bug/694409

[1] If you get a permissions error that’s normally because the addons
directory isn’t writeable by the user that openerp is running as.

Further reading

The OpenERP documentation regarding reports.