I was going to end at part 2, but the recent shock news story about the iPhone and location data has made some of the other things I was thinking about when I took the Android apps apart seem more relevant.
One of the things I mentioned about the adverts was that they were ‘location aware’. This of course means your phone sends information about your location to the ad server. It’s obviously impractical to send all ads for all locations to the phone, so the phone sends its location to the server and the server serves up the relevant ads.
The other thing ad servers need to know is how many unique views there are of the ads. This again requires more data to go to the ad server. For Android devices this normally appears to be done with the ‘ANDROID_ID’. This is a random number generated by the device when it first boots and is supposed to remain constant for the lifetime of the device. The ad library I was looking at further anonymized this by MD5 hashing it before sending it to their servers. So that’s a random number that’s then hashed before being sent. Not too bad really!
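To make the hashing step concrete, here is a minimal sketch in Python rather than the ad library's actual code. The function names are made up for illustration, and I'm assuming the ID is handled as a 16 character hex string (64 random bits); the real library may encode it differently before hashing.

```python
import hashlib
import secrets


def random_android_id() -> str:
    # Hypothetical stand-in for ANDROID_ID: 64 random bits as 16 hex characters.
    return secrets.token_hex(8)


def hashed_device_id(android_id: str) -> str:
    # MD5 the ID and return the 32 character hex digest that would be sent.
    # Assumption: the ID is hashed as a UTF-8 string.
    return hashlib.md5(android_id.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    device_id = random_android_id()
    print(device_id, "->", hashed_device_id(device_id))
```

The same ID always produces the same hash, which is exactly what lets the ad server count unique devices without seeing the raw number.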
Of course how anonymous is anonymous data? It really depends on who’s looking at it.
The ad company isn’t able to determine who was where and when. That number was fairly meaningless after the MD5 hash and was pretty meaningless before. There is one scenario where the data can be linked though. If you have the original data from the phone and the data from the ad company, you can tie up the records. With that it’s possible to demonstrate where the phone believed it was when it connected to the ad server, and at what time. The main place I can see that being used is by law enforcement and other government agencies.
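Tying up the records is not hard once you hold both halves. A rough sketch, assuming a made-up ad server log format (the AdHit fields and the function name are my own invention, not anything I saw in the apps): hash the ID recovered from the phone the same way the library would, then pick out the matching rows.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class AdHit:
    # Hypothetical shape of one row in an ad server's log.
    hashed_id: str   # MD5 hex digest of the device ID
    timestamp: str   # when the ad was requested
    lat: float       # location the phone reported
    lon: float


def hits_for_device(android_id: str, ad_log: List[AdHit]) -> List[AdHit]:
    # Re-create the hash from the ID taken off the phone, then filter the log.
    # Assumption: the ID was hashed as a UTF-8 string with no salt.
    target = hashlib.md5(android_id.encode("utf-8")).hexdigest()
    return [hit for hit in ad_log if hit.hashed_id == target]
```

Without the ID from the phone the log is just a pile of opaque hashes; with it, every matching row is a time and a place.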
Anonymity, as with most things, comes in degrees. Forensic analysis of data after the fact is, as ever, a lot simpler than attempting to spot things in real time.
Of course I should probably mention this has nothing to do with the recent iPhone/iPad fun and games. Even if the iPhone does send your location to their servers regularly it probably isn’t even sending over that hashed device id because they don’t need to know about unique visitors (or worry about tracking visitors) for building up the location database.
Actually, there may be another way your data isn’t properly anonymous to those looking at what’s collected on the ad server. The server technically has the IP address and port you connected from, so if they store that, and in a way that can be cross referenced with the location of the ad hit, then someone could decide to find out the identity of a person from that information along with where they were. Again, that probably requires due process.
Should you be scared? Only as much as usual.