Debugging Web API traffic

Note that this blog post mostly assumes you’re operating on Linux. If you’re using Windows just use Fiddler. It probably does everything you need. Actually, having just looked at their page it looks like it may well work on lots of places other than Windows too, so it might be a good option.

When developing programs that consume API’s that make use of HTTP at some level it’s often useful to check what is actually going over the wire. Note that this is talking about unencrypted traffic at this level. If you’re talking to a server over HTTPS you will need to MITM your connection or get the keys for the session using the SSLKEYLOGFILE environment variable.

The simplest way to capture traffic is generally to use tcpdump. Wireshark is often a good tool for looking at network traces, but for lots of HTTP requests it tends to feel clunky. This is where I turn to a python utility named pcap2har. This converts a packet capture to a HAR file. A HAR file is essentially a json file containing the HTTP requests. It has an array of the requests with each request noting the headers and content of the parts of the request/response. The HAR file format is documented here.

You will actually find that Google Chrome allows you to export a set of requests from it’s Network tab of the Developer toolbar as a HAR file too.

The pcap2har utility isn’t packaged in a particularly pythonic way, and it doesn’t actually extract the request body so I created a minor tweak to it on a branch. This branch does extract the request body which is often used in API calls. You’ll need to pip install dpkt, the rest of the dependencies are bundled in the repo. Then you run it like this,

git clone --branch request_body
cd pcap2har
sudo pip install dpkt
tcpdump port 8069 -w packets.dump
pcap2har packets.dump traffic.har

Having said that HAR is a lot easier to consume, there are viewers, but I’ve not found one that I particularly liked. I tend to either pretty print the json and look at it in a text editor, or then use grep or code to extract the information I want.

For OpenERP/Odoo API calls I created a quick script to explode the contents of the API calls out to make it easier to read. It explodes the xml/json contents within the requests/responses out to json at the same level as the rest of the HAR data, rather than having encoded data encoded within json.