This is a quick brain dump from my brief coding session on PAUSE working on the permissions. This won’t make sense without reading the code yourself. Unless you’re interested in modifying PAUSE or reviewing my changes I would ignore this blog post. I’m writing this for my future self as much as anyone. The behaviour I modified in my pull request is in italics.
The code for setting the permissions when a new distribution is indexed is largely in
PAUSE::package. It’s triggered by
PAUSE::mldistwatch, but the code that actually sets the permissions is in
perm_check checks that the current user has permission to be uploading this distro and bails if not. If this is a new package (and they do have permission) then it also adds the user to the
perms table as they are effectively first come.
give_regdowner_perms adds people to the
perms table where necessary. This was supporting the module permissions in amongst other things. It is now also where we copy the permissions from the main module.
checkin method as well as adding the record to the packages table also adds to the
primeur table via the
checkin_into_primeur function. If the
x_authority flag is used then that user is added, otherwise the main module is looked up, or failing that the uploading user.
Note that the first come (primeur) users appear in both the
There are special checks for Perl distributions in the permissions code that will change the behaviour of the permissions. I am purposely not mentioning them as I never took the time to understand them.
A quick note on testing. As well as the tests which work of the dummy distributions in the corpus directory there is a test utility. To get a look at the permissions database in practice use the
build-fresh-cpan script. This will build a one off cpan environment that you can examine. Just pass it a package to index and then you can check the permissions database.
$ perl one-off-utils/build-fresh-cpan corpus/mld/009/authors/O/OO/OOOPPP/Jenkins-Hack-0.14.tar.gz
Initialized empty Git repository in /tmp/h4L3GU6Awi/git/.git/
cpan db git Maildir pause.log run
$ cd db
$ sqlite3 mod.sqlite
SQLite version 18.104.22.168 2014-10-29 13:59:56
Enter ".help" for usage hints.
sqlite> select * from primeur;
sqlite> select * from perms;
When you start using Docker one of the things it’s quite possibly you’ve hit is output buffering. Docker encourages you to log to stdout/err from your program, and then use docker to feed your regular logging solution.
This generally exhibits itself as you seeing no logging, then after the program has been running for a while you come back to the logs and find they are all there. It doesn’t get as much press as caching but it can be just as head scratching.
With Perl for example the buffering kicks in when it’s not connected to a terminal. For that this somewhat idiomatic snippet can be useful,
select( ( select(\*STDERR), $|=1 ) );
select( ( select(\*STDOUT), $|=1 ) );
As ever with Perl there is more than one way to do it of course…
Note that you could experience this sort of thing when using other logging setups. If you pipe to
logger to output to rsyslog you could experience the same issues.
Note that this blog post mostly assumes you’re operating on Linux. If you’re using Windows just use Fiddler. It probably does everything you need. Actually, having just looked at their page it looks like it may well work on lots of places other than Windows too, so it might be a good option.
When developing programs that consume API’s that make use of HTTP at some level it’s often useful to check what is actually going over the wire. Note that this is talking about unencrypted traffic at this level. If you’re talking to a server over HTTPS you will need to MITM your connection or get the keys for the session using the
SSLKEYLOGFILE environment variable.
The simplest way to capture traffic is generally to use tcpdump. Wireshark is often a good tool for looking at network traces, but for lots of HTTP requests it tends to feel clunky. This is where I turn to a python utility named pcap2har. This converts a packet capture to a HAR file. A HAR file is essentially a json file containing the HTTP requests. It has an array of the requests with each request noting the headers and content of the parts of the request/response. The HAR file format is documented here.
You will actually find that Google Chrome allows you to export a set of requests from it’s Network tab of the Developer toolbar as a HAR file too.
The pcap2har utility isn’t packaged in a particularly pythonic way, and it doesn’t actually extract the request body so I created a minor tweak to it on a branch. This branch does extract the request body which is often used in API calls. You’ll need to pip install dpkt, the rest of the dependencies are bundled in the repo. Then you run it like this,
git clone https://github.com/OpusVL/pcap2har.git --branch request_body
sudo pip install dpkt
tcpdump port 8069 -w packets.dump
pcap2har packets.dump traffic.har
Having said that HAR is a lot easier to consume, there are viewers, but I’ve not found one that I particularly liked. I tend to either pretty print the json and look at it in a text editor, or then use grep or code to extract the information I want.
For OpenERP/Odoo API calls I created a quick script to explode the contents of the API calls out to make it easier to read. It explodes the xml/json contents within the requests/responses out to json at the same level as the rest of the HAR data, rather than having encoded data encoded within json.