Grepping for a single, arbitrary character in a bunch of files.

I had a random error complaining about being unable to read a source file without any real explanation of which file was the problem.

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 25: ordinal not in range(128)

The most logical way to rip through a number of files is find. On linux xxd is the simplest tool for hex dumping and grep can then look at it’s output. The problem occurs when you realise you can’t use | in an -exec with find, or at least I haven’t figured out how. The simplest way to work around that is to put your command into a tiny shell script and -exec that. Obvious really, but it always takes me a moment to remember that.

Note that this is for grepping for a single byte. Grepping for multiple bytes would require a different approach entirely.

cat > bingrep.sh
#!/bin/sh
xxd -g 1 $2 | grep -i "\<$1\>" -q
^d
chmod +x bingrep.sh
find . -type f -name "*.py" -exec ./bingrep.sh c3 {} \; -print

That allows me to look for the rogue 0xc3 characters in the source code I was dealing with.

I suspect I ought to be able to do something similar with regex’s, but you run the risk of stupid interpretation problems. Sometimes it’s simpler to just look at it in the raw, relatively speaking.

A few tips on using the OpenERP XML RPC API

I’ve been developing applications that talk to OpenERP via it’s API for a little while now and I figured it would be worth noting down some general pointers for using it. The API is used by all the UI clients for OpenERP so it definitely allows you to do everything that they do. It’s also got a reasonable amount of access to the data layer so you can do an awful lot via it. From simply creating interfaces to it, to streamlining a process with a slick UI, there are a lot of possibilities with it.

Here are my tips for developing with it,

  1. Run OpenERP with the --log-request and --log-response command line flags to see how OpenERP achieves it’s tasks. It’s also helpful to see what your application is sending.
  2. Don’t be afraid to read the code. Watching OpenERP is a great starting point, but sometimes you need to look at the code to understand what a parameter is for, or how it’s getting the data.
  3. Don’t be afraid to use the debugger. Putting --debug on the command line allows the python debugger to kick in when there is an exception. It also allows you to stick ‘import pdb; pdb.set_trace()’ onto a line in the code you want to investigate and drop into there in the debugger.
  4. Pass the context like you are told to, it makes life easier if you are setup to simply pass it along with your calls (but also be ready to add data to it). The context contains things like the language which is used for translations. You want your applications language to be consistent with any users of OpenERP itself otherwise things will get weird. Also note that some calls require extra information to be passed in via it. It’s not a completely opaque blob that you simply pass about between calls.
  5. Limit the columns you ask for. When you make a read call you can specify which columns you are interested in. Do that from the start or you’ll end up with performance problems later on. Once you have a partner with 1000’s of orders and invoices etc. a simple read of the res.partner will take a significant chunk of time if you aren’t limiting what you read from it.
  6. Don’t be afraid to extend OpenERP. Even if you aren’t a Python developer by trade, if you’re doing serious data modification you’re better off creating a module and calling a method on that than making lots of API calls. The main reason for that is to get everything into a single transaction. The other reason is speed.
  7. Be careful about filtering by calculated fields, they generally get turned into a ‘TRUE’ statement in the SQL. This can really screw things up when you have OR conditions in your filter. Use the --log-sql if you’re unsure what is happening.
  8. False is used for Null/None in the XML RPC
  9. Returning a dictionary from an OpenERP method via the API requires the keys to be strings. i.e you can’t simply return { id: quantity }. You’d need to do { str(id): quantity } to prevent a crash. None is also a no-no for dictionary values. Convert them to False if you want to be consistent with OpenERP.
  10. Formatting of numbers to n decimal places is largely done client side. OpenERP provides you all the info you need, but you need to deal with that yourself.
  11. Beware of gapless ir.sequence sequences in batch jobs. They have a high potential for causing contention.