Comments on: Sampled Traffic Analysis by Internet-Exchange-Level Adversaries

By: Clive Robinson

Clive Robinson — Tue, 05 Jun 2007 10:37:00 +0000

There is a paper from Washington Uni presented at USENIX07 on how consumer devices leak information about you.

One such device they highlight is the Sling Media Slingbox Pro, which streams video across a home network.

They have shown how simple traffic analysis can be used to work out which movie is being streamed, even when encryption is enabled, because the device uses “more efficient” transmission methods (variable bitrate encoding in this instance).

http://www.cs.washington.edu/research/security/usenix07devices.pdf

By: Clive Robinson

Clive Robinson — Sat, 02 Jun 2007 17:58:58 +0000

Richard,

I agree that all the suggestions on their own will have deficiencies, and I certainly do not believe there is or will be a “one shot silver bullet solution”.

As you say the usual rule of traffic analysis… and you will note I qualified dummy traffic with a “realistic”.

All the open literature proposals I have looked at in the past have made the mistake of trying to optimise in some way to preserve performance, which experience has shown will always leave a hook or two (the classic example is secure crypto algorithms such as AES, RSA, etc. that, when optimised for performance, leak key information through timing differences due to cache hits etc.).

Essentially the problem is that “if” you allow the traffic to look “different” in some respect for the sake of “efficiency” or “low latency” or a host of other reasons, then you leave it open to attack.

However in military networks the usual premise is that dummy traffic always looks the same from all sides. That is, it is always (link) encrypted, packets are always the same size, and packet times and rates are fixed. This does not give the hostile organisation a hook to hang their hat on and get down to work (or at least that is the theory 8).

Therefore if a proposal only fixes one aspect of the traffic differentiation then it obviously will not be effective.

The trade-off is like being a passenger on a motorbike or on a train. The motorbike gives freedom in time and place, high rates of acceleration, good fuel economy, etc., but how safe are you compared to a modern train on a well organised and run network (France, Germany, etc.)?

If all the individual suggestions I gave (and one or two others) are combined in a thoughtful manner then, although not perfect, the system will go a long way to solving the “known” problems.

As you then say, you build and test a real world system and revise as required until you have removed the hooks that you and others can currently find. Then you allow time for people to attack the solution, which usually moves the body of knowledge on as new methods of attack become apparent (FEAL being an example).

The current body of open knowledge on traffic analysis can be compared to the state of the open knowledge of cryptography ten years after DES came into use.

If people are negative about traffic analysis prevention techniques then the body of open knowledge will not move forward because nobody will bother (think number factoring prior to RSA).

However, apart from moving the “body of knowledge” forward, the real question is: are “real users” going to accept the limitations in efficiency etc. to gain the increase in security that this method gives?

I suspect that the razor about security-v-usability also applies to security-v-performance, i.e. you can have security or performance but not both. This would support the argument that “security always has costs”, and raises the question “are people prepared to pay the cost?”

However you will also note that I very specifically avoided the hostile entity being a part of the network, as this will certainly allow any and all traffic analysis protection to be stripped away using one of many, many techniques.

By: Richard Clayton

Richard Clayton — Thu, 31 May 2007 16:42:24 +0000

Clive's suggestion of synchronised data sending is essentially Wei Dai's PipeNet proposal which dates back to 1995. As to dummy traffic ... well the anonymity literature is full of hand-waving proposals for such traffic; and attacks on the schemes that are fleshed out enough to permit others to study them. The usual rule of traffic analysis "It's hard to make one thing look like another" continues to apply in spades!

By: Clive Robinson

Clive Robinson — Tue, 29 May 2007 21:34:33 +0000

@Alex,

Sorry, I was a bit brief in my earlier reply, so:

Let us assume a simple model of the system in terms of the points the adversary (hostile organisation) can monitor but not read, as the traffic is encrypted:

1, User TOR
2, TOR Destination
3, TOR TOR

The user wants to communicate with the destination, and the TOR network is used to hopefully make the communication anonymous.

If you assume that the hostile organisation can see two or more of those points, then because of the low latency of the TOR network they can determine whether the traffic in the paths is correlated.
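A minimal sketch of the kind of correlation test this implies, assuming the observer records packet arrival times at two points and compares per-interval packet counts. The function names, the bin width, and the use of a Pearson coefficient are illustrative assumptions, not something taken from the paper or from TOR itself.

```python
from math import sqrt

def bin_counts(timestamps, bin_width, n_bins):
    """Count how many packets fall into each fixed-width time bin."""
    counts = [0] * n_bins
    for t in timestamps:
        i = int(t // bin_width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts

def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is flat)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def best_lag_correlation(entry, exit_, max_lag):
    """Try delaying the entry series by 0..max_lag bins against the exit
    series and return (best correlation, lag in bins)."""
    best = (-1.0, 0)
    for lag in range(max_lag + 1):
        a = entry[:len(entry) - lag] if lag else entry
        b = exit_[lag:]
        r = pearson(a, b)
        if r > best[0]:
            best = (r, lag)
    return best
```

A low latency network keeps the lag small and the shape of the count series intact, which is what makes the peak in the correlation stand out.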

In TEMPEST work the first couple of things you get taught are Bandwidth and Energy: both have to be present in sufficient quantities for communication to take place, and if either is insufficient then communication cannot take place.

Once you understand that, they start teaching you about cross-modulation, where one communications signal carries another on top of it (an inadvertent or covert channel).

All of these have their analogues in the network domain, where Energy can be related to the number of packets sent, Bandwidth to channel capacity, and cross-modulation to several things, of which timing jitter on packets is one.

Let us assume that the hostile organisation can control the network between the user and the TOR (User TOR). They can put timing delays on the network packets to modulate the data stream (think about Spread Spectrum Communications or Digital Watermarking etc.).

If they suspect you are talking to a particular destination and they can monitor that link (TOR Destination), then your traffic can easily be detected by cross-correlating to find the effective Digital Watermark they superimposed onto the user's network packet timing.
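A rough sketch of how such a timing watermark might be embedded and then detected, assuming the adversary delays selected inter-packet gaps according to a pseudo-random ±1 chip sequence and later correlates observed gaps against the same sequence. The names `make_chips`, `embed` and `detect`, and the choice of one chip per gap, are hypothetical simplifications of the spread spectrum idea rather than a description of any real tool.

```python
import random

def make_chips(n, seed):
    """Pseudo-random +1/-1 chip sequence known only to the adversary."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n)]

def embed(gaps, chips, depth):
    """Add `depth` seconds of delay to every gap whose chip is +1
    and leave gaps whose chip is -1 untouched."""
    return [g + depth * (c + 1) / 2 for g, c in zip(gaps, chips)]

def detect(gaps, chips):
    """Correlate mean-removed gaps with the chips. The score is close to
    depth/2 when the watermark is present and close to 0 when it is not."""
    mean = sum(gaps) / len(gaps)
    return sum((g - mean) * c for g, c in zip(gaps, chips)) / len(gaps)
```

The longer the flow, the smaller the delay that can be detected, because unrelated jitter averages out in the correlation while the watermark does not.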

Worse, they can do the opposite, which is to modulate all traffic from the destination into the TOR. If they use the equivalent of the JPL Ranging Codes then they can find all people using the Destination site simply by monitoring all the TOR outputs. Unfortunately this attack would work simultaneously for all data streams on the link, which is just plain nasty.

This rather nasty attack can be fairly easily stopped. In TEMPEST they tell you repeatedly “clock not just your inputs but your outputs”; this prevents most timing attacks by removing the cross-modulation at the input or the output.

So if all the TOR nodes were time synchronised and only sent data out at agreed time intervals, the hostile organisation would find it difficult (but not impossible) to make this kind of attack.
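A minimal sketch of clocked output, assuming every node holds each packet until the next shared tick boundary. The function name and the millisecond units are assumptions made for illustration.

```python
def clocked_release(arrivals_ms, tick_ms):
    """Map each arrival time (integer ms) to the first tick boundary
    at or after it, using integer ceiling division."""
    return [-(-t // tick_ms) * tick_ms for t in arrivals_ms]

# Delays that stay inside one tick vanish at the output...
original = clocked_release([10, 120, 250], 100)
delayed = clocked_release([40, 150, 280], 100)

# ...but a delay large enough to cross a boundary is still visible,
# which is why the attack becomes difficult rather than impossible.
crossing = clocked_release([90 + 30], 100)
```

The tick length sets the trade-off: a longer tick hides larger injected delays but adds up to one tick of latency at every node.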

You could then think of the TOR network like a pipelined CPU: the latency increases but so does the throughput. Alternatively think of it as a digital delay line for the covert channel: the more the delays, the lower the bandwidth available to the channel.

The downside is obviously the fine dividing line between latency and the size of message transmitted. At some point a large enough message will allow the hostile organisation to watermark the message at a given level of latency. So the bigger the message the greater the latency required, which is a double whammy…

Another technique that can be used is for the TOR to use a low latency between nodes but randomly delay a network packet for n periods. Let us say it does this randomly with one in ten packets and the TOR depth is a minimum of ten nodes; then the expected normal distribution of the latency is going to prove somewhat unhelpful to the hostile organisation. However it will average out for the user, which gives another area for trade-off.
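To put rough numbers on this, here is a small sketch under simplifying assumptions: each node independently delays a packet with probability 0.1, and a delay is exactly one period (n = 1). Under those assumptions the number of extra periods over ten nodes is binomially distributed, which is only approximately normal at this depth.

```python
from math import comb

def delay_pmf(hops, p):
    """Probability of k extra periods of delay over `hops` nodes,
    for k = 0..hops, when each node delays independently with chance p."""
    return [comb(hops, k) * p**k * (1 - p)**(hops - k) for k in range(hops + 1)]

def mean_and_variance(pmf):
    """Mean and variance of a distribution given as a list indexed by k."""
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, var

pmf = delay_pmf(10, 0.1)
mean, var = mean_and_variance(pmf)
```

With these figures the mean is one extra period and the variance is 0.9, and about 35% of packets pick up no extra delay at all, so a deeper path or a higher delay probability would be needed for the spread to look closer to normal.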

However that does not help if the amount of traffic on the TOR is low. No matter how much timing jiggling the TOR network does, if yours is the only traffic through it then the hostile organisation has an easy time finding both ends of the communication path.

This is where fixed rate communications are usually used, with dummy traffic insertion being the usual method of ensuring the fixed rate. If all nodes send exactly ten packets in a given time period then the hostile organisation will find it difficult to tell the dummy traffic from the real traffic. Obviously the current web browser clients would not like this, but it would be quite easy to arrange with Ajax type techniques (one for the programmers to get their teeth into).

Obviously dummy traffic takes up quite a bit of bandwidth, which would appear to be a waste, especially within the TOR network (TOR TOR) paths. But does it need to be dummy traffic?

No, not if the TOR network were used to carry two types of traffic, one for which latency is important (Web) and one for which it is not (EMail). Then the dummy traffic could be replaced with the traffic that does not require low latency, and only when that was scarce would dummy traffic be inserted.
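A sketch of how a node might fill each fixed-rate period, assuming two queues (latency-sensitive and latency-tolerant) and a fixed number of packet slots per period. The queue names and the `fill_slots` function are hypothetical, and a real system would also need every packet padded to the same size and encrypted so that the three kinds are indistinguishable on the wire.

```python
from collections import deque

def fill_slots(web, email, slots):
    """Return exactly `slots` packets for one period: web traffic first,
    then email, and dummy packets only when both queues are empty."""
    out = []
    while len(out) < slots:
        if web:
            out.append(("web", web.popleft()))
        elif email:
            out.append(("email", email.popleft()))
        else:
            out.append(("dummy", None))
    return out

web_queue = deque(["w1", "w2"])
email_queue = deque(["e1"])
period = fill_slots(web_queue, email_queue, 5)
```

The observer always sees the same number of packets per period; only the node knows how many of them carried real data.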

This then leaves another problem, which is demand. What happens when the demand exceeds a certain threshold, so that the web traffic becomes sufficiently dominant that the cross-modulation attack becomes possible again at the given packet rate? Well, the simple solution is to make the packet rate variable in fixed size steps and have the whole TOR network switch up and down as appropriate.
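One way the step selection could look, assuming the network agrees on a small set of permitted packet rates and a ceiling on the share of slots that latency-sensitive traffic may occupy. The 50% ceiling and the function name are illustrative assumptions only.

```python
def choose_rate(web_demand, steps, max_web_share=0.5):
    """Pick the lowest permitted rate (packets per period) at which web
    traffic stays within `max_web_share` of the slots; if none is high
    enough, fall back to the top step."""
    for rate in sorted(steps):
        if web_demand <= rate * max_web_share:
            return rate
    return max(steps)

steps = [10, 20, 40]
quiet = choose_rate(4, steps)
busy = choose_rate(6, steps)
overloaded = choose_rate(100, steps)
```

Because every node switches together and only between a few fixed rates, the rate itself reveals overall load on the network but little about any individual flow.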

This should give you a flavour of both what is possible for the TOR network to help anonymous communications and what the hostile organisation has to do to fingerprint traffic through the TOR network.

By: Clive Robinson

Clive Robinson — Tue, 29 May 2007 13:50:14 +0000

Oddly I made a comment to Richard about choke points and mix networks back in October last year (see comment 2).

http://www.lightbluetouchpaper.org/2006/09/23/random-isnt-always-useful/

I guess this partially answers some of the issues I raised.

By: Clive Robinson

Clive Robinson — Tue, 29 May 2007 13:31:15 +0000

Alex,

No, it would not be foolproof, as the adversary (being effectively omnipotent 😉) could still cross-correlate against time, and they can always timestamp enumerate the TOR network to then build a latency map.

The only solution to the problem is to completely decouple time and size related information, not just between the input and output of the network but in between nodes as well, which although it can be done has issues such as latency.

Oh, and you also need a reasonable level of realistic / dummy traffic in, out and through the TOR network as well for the real messages to be able to hide in.

By: Alex

Alex — Mon, 28 May 2007 15:21:27 +0000

Isn’t what we need something like – splitting the connection across multiple nodes, with each user also being a router node? It could be faster, and more secure. You could call it “TorRent” 😉