Watching them watching me

On and off for the past two years, I have been investigating anti-counterfeiting measures in banknotes, in particular the Counterfeit Deterrence System (CDS). At the request of the Central Bank Counterfeit Deterrence Group (CBCDG), this software was added to scanner and printer drivers, as well as to image manipulation packages, in order to detect images of currency and prevent them from being processed.

I wrote a webpage on some experiments I ran on the CDS and gave a talk presenting the results of reverse engineering. Unsurprisingly this drew the attention of the involved parties, and while none of them contacted me directly, I was able to see them in my web logs. In September 2005, I first noticed Digimarc, who developed the CDS, followed a few hours later by the European Central Bank and the US Treasury (both CBCDG members), suggesting Digimarc tipped them off.

However none of these paid as much attention as the Bank of England (also a CBCDG member) who were looking at my pages several times a week. I didn’t notice them for a while due to their lack of reverse DNS, but in December I started paying attention. Not only was their persistence intriguing, but based on referrer logs their search queries indicated a particular interest in me, e.g. Project Dendros Steven Murdoch (Dendros is one of my research projects).
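For anyone wanting to do the same with their own logs, here is a minimal sketch of pulling the search terms out of a referrer. It assumes the referrer is a Google-style search URL carrying the query in a `q` parameter; the function name `search_query` and the example URL are made up for illustration and are not the script used for this post.

```python
from urllib.parse import urlsplit, parse_qs


def search_query(referrer):
    """Return the search terms from a Google-style referrer URL, or None."""
    if not referrer or referrer == "-":
        # No referrer was sent (bookmark, pasted link, or stripped en route).
        return None
    parts = urlsplit(referrer)
    if "google." not in parts.netloc:
        return None
    # parse_qs decodes '+' and %xx escapes and returns a list per parameter.
    terms = parse_qs(parts.query).get("q")
    return terms[0] if terms else None


# Hypothetical referrer, built around a query quoted in the text.
example = "http://www.google.co.uk/search?q=Project+Dendros+Steven+Murdoch&hl=en"
print(search_query(example))
```

A log line with a missing referrer (usually written as `-`) simply yields `None`, which is itself informative when comparing visitors.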

Perhaps they just found my work of interest, but in case they had concerns about my research (or me), I wanted to find out more. I didn’t know how to get in contact with the right person there, so instead I rigged my webpage to show visitors from either of the Bank of England’s two IP ranges a personalised message. On 9 February they found it, and here is what happened…

The first visitor went to my homepage after a Google search for Steven Murdoch Cambridge University, but instead found my message. It was only three paragraphs long, but after 43 seconds the visitor closed my webpage, without clicking any links. I am guessing a personalised webpage was surprising enough to make this user give up on their original task.

The second visitor came 25 minutes later, and didn’t send a referrer, so this could be via a bookmark or a link copied and pasted from an email. This time only 7 seconds were spent on the message before continuing to my original homepage. However after 8 seconds this visitor went back to the message, as if doing a double take in disbelief, before closing the window 2 seconds later.

Our third visitor arrived another 5 hours later and, instead of going to my homepage, went directly to the “Continue” link on the message page without loading the message itself. It might be the same person as before, but this user didn’t have a cookie set, so I suspect one of the previous visitors emailed that link to this user. This visitor spent over 10 minutes on my homepage, but didn’t seem to follow any links.

The final visitor came the following day, but not directly. Instead he or she used Google to search for notes cambridge "steven murdoch", and then requested the Google cache. My website was up and running at that time, so I presume this visitor wanted to hide their request. However, the Google cache is not designed with privacy in mind: the cached page still makes the browser fetch external references such as stylesheets from the original server, so those requests show up in my logs. I had seen this behaviour from the Bank of England before, but didn’t have the logs to explain it fully.

I also note that the Bank of England IT system uses a Blue Coat proxy. This adds a header X-Bluecoat-Via, which is used to prevent looping. Another anomalous header is Novinet, produced by the Novell NetIdentity client, used to perform authentication on an intranet. I do think that it is a minor security risk to give away information on LAN infrastructure to random web servers.
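To illustrate how little effort it takes a web server to pick up on this, here is a minimal sketch that scans a request's headers for the two names mentioned above. It assumes the headers are already available as a dictionary (a stock access log would not record them without extra configuration); the function name, the labels and the example values are all hypothetical.

```python
# Header names taken from the text; keys are lower-cased because HTTP
# header names are case-insensitive.
REVEALING_HEADERS = {
    "x-bluecoat-via": "Blue Coat proxy",
    "novinet": "Novell NetIdentity client",
}


def infrastructure_hints(headers):
    """Map each recognised piece of infrastructure to the header value seen."""
    found = {}
    for name, value in headers.items():
        label = REVEALING_HEADERS.get(name.lower())
        if label is not None:
            found[label] = value
    return found


# Made-up request headers; the values are placeholders, not real log data.
request_headers = {
    "Host": "www.example.org",
    "X-Bluecoat-Via": "0123456789ABCDEF",
    "Novinet": "placeholder-value",
}
print(infrastructure_hints(request_headers))
```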

There is also some unusual pre-fetching behaviour, which I presume comes from the Blue Coat proxy. Although, based on the user-agent, the HTML pages are downloaded by IE on Windows, other embedded content, such as images and stylesheets, arrives with a spoofed user-agent and a missing referrer. It seems that the proxy server requests these prerequisites before the browser has rendered the page, in order to speed up the download.

Finally, as noted above, two IP addresses in separate netblocks are in use (Bank-Of-England and BANKENGL01). I initially thought these were different departments, but it now seems that they are used interchangeably, and I have observed switching between them while downloading a single webpage.
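Since a single page load can be split across both netblocks, any per-visitor logic (such as deciding who sees a personalised message) has to treat the two ranges as one source. A minimal sketch of such a check follows, using Python's standard `ipaddress` module. The ranges are placeholders from the reserved documentation blocks, not the real netblocks, and the page names are invented for the example.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), NOT the real netblocks.
WATCHED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def page_for(remote_addr):
    """Choose which page to serve based on the client's address."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparseable address: fall back to the normal homepage.
        return "index.html"
    # A match in either range counts as the same organisation.
    if any(addr in net for net in WATCHED_RANGES):
        return "message.html"
    return "index.html"


print(page_for("192.0.2.17"))
print(page_for("198.51.100.200"))
print(page_for("203.0.113.5"))
```

Matching on the address alone is crude: anyone behind the same proxy sees the same page, which is consistent with the cookie-less third visitor described earlier.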

So in conclusion, you can learn a lot from web logs, Google cache does not make you invisible (use Tor instead), and do not underestimate the dedication to procrastination of a PhD student writing up. Finally, the Bank of England didn’t visit again, at least from the IP ranges I know about, and didn’t contact me either. Perhaps I scared them off?

9 thoughts on “Watching them watching me”

  1. after 43 seconds the visitor closed my webpage, without clicking any links.

    How do you know about that part? I don’t think it’s something you can guess from the logs.

  2. @Anonymous

    I added JavaScript which started a timer; when the page was left, an onUnload event handler submitted the results to my webserver. In my tests it was not called every time the page was closed, but when it was, the timing was reasonably accurate.

  3. That is interesting. I only wish the log file stat analyzer used by my webhost was better. It is another reason I want to go back to hosting myself; I firmly believe there is no better log file analyzer than AWStats.
