Setting up a new CPAN Mini Mirror

It’s time for me to set up a new dev laptop and I wanted to set up a CPAN mirror quickly. Using a mirror is something I’ve done for a long time now; it’s great for coding without internet. Unless you’re doing lots of module installs, this is likely to use more bandwidth than downloading on an ad hoc basis, but the tradeoff is that I can install almost any module I realise I need, whether or not I have connectivity. The only limitations come from external dependencies on things like operating system packages and tools.

I have previously set up mirrors using lxc and salt to provision the machine, but this time I decided to convert that setup to Docker. I also simplified it to just the mirror, as I don’t need to inject modules into the CPAN server any more. In truth I never really did on my laptops; I only did that because it was useful for work at the time. I created the salt configuration to make it easier to re-provision servers for a work setup with a private mirror that also held work modules, which allowed a full CPAN-style deployment process for both public and private code.

To do this I’ve used docker-compose, which is almost always the best option for a laptop setup. Even with a single machine it generally makes life simpler, as you can encode all the configuration in the file and end up with a few simple, consistent commands to build, set up and run your containers. I’ve also set the file to the highest version number currently available, partly just to see what’s available and partly because I want to make use of some of the newer volume features. It doesn’t appear that I can express everything I want perfectly in the docker-compose file, but I suspect that they aren’t really targeting my situation. Docker is popular partly because it works well on developer laptops, where cached file system layers and lightweight machines suit a constrained environment. Fundamentally, though, docker is aiming at servers, so they’re trying to deal with sharing resources across multiple machines rather than working on a single machine.

What I really wanted to be able to do was specify the exact details of the volume shared between the containers, from its location down to the user IDs used for it. If that’s currently fully possible, I haven’t seen a way to do it.

This setup is not designed for the general internet or with security in mind (not that a simple mirror should really have much in the way of threats). I don’t even expose the ports; I just print out the URL for use when running cpanm. I also update the server manually rather than set up a cron job for it, as I don’t really want to use that much bandwidth on this. I don’t use the mirror that often, but when I do it’s really valuable, even if it doesn’t have the latest and greatest versions of all the modules.
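As a rough illustration of the "print out the URL" step, here is a minimal Python sketch. It is not the code from my configuration. It assumes you feed it the JSON that `docker inspect` prints for the mirror container, and that the web server inside the container listens on port 80; the function names and the port default are made up for the example. The `--mirror` and `--mirror-only` options are standard cpanm flags.

```python
import json


def mirror_urls(inspect_json, port=80):
    """Build mirror URLs from the JSON printed by `docker inspect <container>`.

    The port is an assumption; change it if the web server listens elsewhere.
    """
    urls = []
    for container in json.loads(inspect_json):
        networks = (container.get("NetworkSettings") or {}).get("Networks") or {}
        for network in networks.values():
            ip = network.get("IPAddress")
            if ip:
                suffix = "" if port == 80 else ":%d" % port
                urls.append("http://%s%s/" % (ip, suffix))
    return urls


def cpanm_command(url, module):
    """Return the cpanm argument list that installs `module` from the mirror only."""
    return ["cpanm", "--mirror", url, "--mirror-only", module]
```

Passing the first URL and a module name to `cpanm_command` gives an argument list you can hand to `subprocess.run` or simply print and paste into a shell.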

Having all the modules locally can also be great when you want to do some analysis of what existing modules are doing. It’s reasonably easy to write scripts to, say, find all the XS modules and then extract their C code to see which ones call a particular function.
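To give a feel for that kind of analysis, here is a minimal sketch in Python. It assumes the mirror is a directory tree of `.tar.gz` distributions (the usual minicpan layout under `authors/id`), and it only looks at `.xs` files for something that looks like a call to the named function. The function name `xs_files_calling` is made up for the example, and distributions packed in other formats are simply skipped.

```python
import re
import tarfile
from pathlib import Path


def xs_files_calling(mirror_root, function_name):
    """Return (archive name, member path) pairs for .xs files that appear to
    call `function_name`, searching every .tar.gz below `mirror_root`."""
    # Match the bare name followed by an opening parenthesis, e.g. "newSVpv (".
    pattern = re.compile(rb"\b" + re.escape(function_name.encode()) + rb"\s*\(")
    hits = []
    for archive in sorted(Path(mirror_root).rglob("*.tar.gz")):
        try:
            with tarfile.open(archive, "r:gz") as tar:
                for member in tar:
                    if not (member.isfile() and member.name.endswith(".xs")):
                        continue
                    handle = tar.extractfile(member)
                    if handle is not None and pattern.search(handle.read()):
                        hits.append((archive.name, member.name))
        except (tarfile.TarError, OSError, EOFError):
            # Skip archives that are corrupt or not really gzipped tarballs.
            continue
    return hits
```

This is a plain text match, so it will also catch the name inside comments or strings; for a first pass over a whole mirror that is usually good enough to narrow down which distributions deserve a closer look.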

The configuration is on GitHub here. The downloaded modules are kept in a volume outside the containers, so updating or removing them should be easy. In theory it should even be possible to wrap the setup around an existing mirror if you already have the files.