Date: 2017-11-01
I’m at IMC 2017 at Queen Mary University of London, and will try to liveblog a number of the sessions that are relevant to security in followups to this post.
3 thoughts on “Internet Measurement Conference”
Vasileios Giotsas kicked off with a talk on inferring BGP blackholing carried out by ASes to deal with DDoS traffic. This has been going on for years but there’s little open data on who does it, how and for whom. The AS of a target (the blackhole user) can use a BGP update to notify others of a blackholed prefix under RFCs 5635 and 7999, so they can drop traffic. Blackhole providers include not just the upstream ASes but also IXPs that provide interfaces for their community. However, the communities used are not standardised. Vasileios used data from four BGP collectors round the world, peered with over 2,700 ASes, assembling the data for each blackhole event. Over the past three years, the number of providers has doubled from about 40, and over 160,000 prefixes have been blackholed in the period, with visible spikes for events ranging from the Turkish coup to Mirai. Providers include both IXPs and transit providers; their services are most popular among content providers and hosters, often serving ephemeral or low-ranked domains.
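To make the inference step concrete, here is a minimal sketch of flagging blackhole announcements by their BGP communities. It is not the method from the talk: the update records, the `is_blackhole` helper and the provider-specific community `64511:666` are hypothetical, and only the `65535:666` value comes from RFC 7999.

```python
# RFC 7999 well-known BLACKHOLE community. Providers often use their own
# values instead, which is why a per-provider dictionary is needed.
RFC7999_BLACKHOLE = (65535, 666)


def parse_communities(text):
    """Parse whitespace-separated 'ASN:value' tokens into integer pairs."""
    pairs = []
    for token in text.split():
        asn, _, value = token.partition(":")
        pairs.append((int(asn), int(value)))
    return pairs


def is_blackhole(update, extra_communities=frozenset()):
    """Return True if the update carries a known blackhole community."""
    known = {RFC7999_BLACKHOLE} | set(extra_communities)
    return any(c in known for c in parse_communities(update["communities"]))


# Hypothetical updates using documentation prefixes and private-use ASNs.
updates = [
    {"prefix": "192.0.2.1/32", "communities": "64500:100 65535:666"},
    {"prefix": "198.51.100.0/24", "communities": "64500:100"},
    {"prefix": "203.0.113.7/32", "communities": "64511:666"},
]

# Assumed provider-specific blackhole community (illustrative only).
provider_specific = {(64511, 666)}

flagged = [u["prefix"] for u in updates if is_blackhole(u, provider_specific)]
```

Without the provider-specific set, the third update would go unnoticed, which illustrates why non-standard communities make this kind of measurement hard.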
Romain Fontugne was next on pinpointing delay and forwarding anomalies. When a customer complains to their ISP that a website is too slow, the usual approach is to use traceroute, ping and operator mailing lists, which, being manual, is slow and expensive and doesn’t scale. Romain’s solution is to exploit the measurement data from an existing public source, RIPE Atlas. Particularly useful are “builtin”, which sends a traceroute every 30 minutes from 500 servers to all DNS root servers, and “anchoring”, which sends traceroutes every 15 minutes to 189 collaborating services. The core of the work is finding smart ways to analyse noisy data, traffic asymmetry and packet loss. With enough probes, you can monitor the health of many intermediate links; in fact the public data suffice to monitor hundreds of thousands of links in real time with no extra burden on the network. In addition, Romain has packet loss models to spot router failure and a health metric for ASes. Case studies include the June 2015 Telekom Malaysia route leak, which affected 170,000 prefixes, and related congestion in Level 3 in London.
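As a rough illustration of analysing noisy traceroute data, the sketch below flags a link when the median of its differential RTT (far hop minus near hop) departs from a reference median. This is a simplified stand-in using a median-absolute-deviation test, not the statistical method from the talk; the function names, the threshold `k` and the sample values are all assumptions.

```python
from statistics import median


def differential_rtts(samples):
    """samples: (rtt_near_ms, rtt_far_ms) pairs from traceroutes crossing one link."""
    return [far - near for near, far in samples]


def is_delay_anomaly(reference, current, k=3.0):
    """Flag the current bin if its median is more than k MADs from the reference median."""
    ref_med = median(reference)
    mad = median(abs(x - ref_med) for x in reference)
    spread = max(mad, 0.1)  # floor avoids a zero spread on very stable links
    return abs(median(current) - ref_med) > k * spread


# Hypothetical data: one outlier in the reference does not move the median much.
reference = differential_rtts(
    [(10.0, 15.0), (10.0, 15.2), (10.0, 14.9), (10.0, 15.1), (10.0, 15.0), (10.0, 40.0)]
)
normal_bin = differential_rtts([(10.0, 15.1), (10.0, 14.8), (10.0, 15.3)])
congested_bin = differential_rtts([(10.0, 22.0), (10.0, 25.5), (10.0, 21.0)])

normal_flag = is_delay_anomaly(reference, normal_bin)
congested_flag = is_delay_anomaly(reference, congested_bin)
```

Medians are used rather than means because a single slow traceroute reply should not trigger an alarm for the whole link.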
Yves Vanaubel is working on Tracking Invisible MPLS Tunnels. Routing data collected by groups like CAIDA are often inaccurate because opaque MPLS clouds made up of many invisible tunnels lead naive scans to infer a small number of nodes with an unrealistically large degree (e.g., over 128). How can routes be inferred more accurately? Yves has developed two techniques: direct path revelation, for networks not using MPLS internally, and mostly relevant to Juniper devices; and recursive path revelation, for networks using MPLS throughout, mostly relevant to Cisco. In the former you try to run a trace to an internal prefix and see if routers reveal themselves; in the latter, you try to run a trace to the egress router, and try internal prefixes. He’s run experiments on PlanetLab with 91 nodes, selecting high-degree nodes from the CAIDA dataset, and mapping the length of invisible tunnels found.
Amogh Dhamdhere is working on inferring Internet congestion. This is a complex subject, tied up with commercial manoeuvring (as in the Comcast/Netflix peering negotiations); how can we get good data? M-Lab started using throughput-based measurements based on NDT data in 2014-15, and Amogh has been thinking about the best methodology to use. Simple network tomography – inferring congested links from end-to-end throughput data – is harder than it looks; see for example the previous paper. But various things can be done. In the top five US ASes, the client and server were not more than one hop away in 82% or more of the NDT tests. As for link diversity, the M-Lab server in Atlanta (hosted by Level 3) found 1 or 2 links to Comcast and Frontier, but 14 to AT&T. He also uses bdrmap to identify interconnects between ASes, but only a small fraction of them may be testable by M-Lab; speedtest.net covers more as it has thousands of servers rather than hundreds.
Rodérick Fanou is investigating the causes of congestion in African IXPs. Recent work in South Africa and elsewhere has shown that consumers in Africa often don’t get the advertised broadband performance. Rodérick believes his is the first congestion study in the continent. He has been deploying vantage points in five countries and did time-series latency probes to perform network tomography for a year till February 2016-7. He validated his results via IXP operator interviews. Sustained congestion cases include GIXA, hosted by Ghanatel; this was providing free transit to a content network through a 100Mbps link, which was congested, while serving its own paying customers through a 1Gbps link which was not. Another ISP, Netpage, ended up paying for an upgrade so its customers could get Google traffic without congestion. Rodérick concludes that the IXP ecosystem is highly dynamic in Africa, so longitudinal measurement and monitoring would be valuable. In questions, someone remarked that Google should put their servers at the exchange rather than at the monopoly telco.
Amogh Dhamdhere spoke again, on TCP Congestion Signatures. Conventional speed tests don’t tell us much about the nature of congestion: was it self-induced, as with last-mile flows, or did the flow hit an already-congested link upstream? The two cases induce different TCP behaviour, after all. He’s found that self-induced congestion can be detected by looking at the coefficient of variation of the round-trip time, and at the difference between its maximum and minimum. These can be fed into a classifier. For example, a strong correlation between throughput and TSLP latency suggests that congestion is external. He has done controlled experiments in a testbed to validate the method, getting 100% accuracy in detecting self-induced congestion and 75% for the external variety.
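The two RTT features can be sketched as follows. This is an illustration only: it assumes the variability statistic is the coefficient of variation and that the max-min difference is normalised by the maximum, and the RTT samples are invented. The intuition is that a flow filling an empty buffer sees its RTT climb, while a flow arriving at an already-full buffer sees a high but flat RTT.

```python
from statistics import mean, pstdev


def rtt_features(rtts_ms):
    """Return (normalised max-min difference, coefficient of variation) of RTT samples."""
    lo, hi = min(rtts_ms), max(rtts_ms)
    norm_diff = (hi - lo) / hi
    cov = pstdev(rtts_ms) / mean(rtts_ms)
    return norm_diff, cov


# Hypothetical RTT samples (ms) taken early in a flow.
self_induced = [20, 22, 30, 45, 70, 100]   # queue builds as the flow ramps up
external = [95, 98, 100, 97, 99, 100]      # queue was already full upstream

self_features = rtt_features(self_induced)
external_features = rtt_features(external)
```

In practice these two numbers would be the inputs to a trained classifier rather than compared by eye; any fixed threshold chosen here would be a guess.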
The morning’s last speaker was Qiao Zhang, talking about measuring data-centre microbursts. He designed a high-resolution counter collection framework that can sample at 25 microseconds while keeping sampling loss below 1%, and deployed it at Facebook. He collected a random 2-minute sample per hour over 30 racks for 24 hours. He defined a microburst as a short period in which utilisation goes over 50%. He looked at three rack types: web, cache and Hadoop, and found that the 50% threshold could have been as much as 80% with little change. Median inter-burst time is about a millisecond; the bursts themselves are correlated with application behaviour such as scatter-gather. The directionality is down for web and Hadoop (because of fan-in), but up for cache (as answers are bigger than queries).
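The microburst definition translates directly into a run-length scan over utilisation samples. The sketch below assumes one utilisation value (0 to 1) per 25-microsecond interval; the function name and sample data are hypothetical.

```python
def find_microbursts(utilisation, threshold=0.5, interval_us=25):
    """Return (start_us, duration_us) for each run of samples above threshold."""
    bursts = []
    start = None
    for i, u in enumerate(utilisation):
        if u > threshold and start is None:
            start = i  # a burst begins
        elif u <= threshold and start is not None:
            bursts.append((start * interval_us, (i - start) * interval_us))
            start = None  # the burst has ended
    if start is not None:
        # The trace ended while still inside a burst.
        bursts.append((start * interval_us, (len(utilisation) - start) * interval_us))
    return bursts


samples = [0.1, 0.7, 0.9, 0.2, 0.1, 0.6, 0.3, 0.8]
bursts = find_microbursts(samples)
bursts_at_80 = find_microbursts(samples, threshold=0.8)
```

Re-running with a different `threshold` is a cheap way to check how sensitive the burst count is to the chosen cut-off.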
Franziska Lichtblau has been studying attacks that rely on spoofed IP addresses. She’s been collecting data on actual spoofed interdomain traffic. She first had to figure out how to detect spoofing efficiently, which she does by parsing ASes’ prefix lists; rather than using the existing CAIDA “customer cone”, which doesn’t account for peering relationships, she developed her own system called Fulcrum to do this better. It’s not trivial because of multi-AS organisations, hidden AS relationships, and stray traffic. She tuned her system to be conservative, with a low false positive rate. Applying this to a large European ISP, she found 0.012% of the bytes were spoofed, much of it trigger traffic for DDoS attacks, which give rise to much larger total traffic volumes. She also found that 30% of IXP members don’t filter traffic at all. As for who’s being spoofed, it’s mostly Chinese ISPs, and the traffic is mostly going to NTP servers.
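The basic check behind prefix-list spoofing detection can be sketched in a few lines: a packet is suspicious if its source address lies outside every prefix the sending AS could legitimately originate or forward. This is not the system from the talk; the data structure, function names and example prefixes are hypothetical, and unknown ASes are deliberately not flagged to keep false positives low.

```python
from ipaddress import ip_address, ip_network


def build_valid_space(prefixes_by_as):
    """Map each AS number to the list of networks it may legitimately send from."""
    return {
        asn: [ip_network(p) for p in prefixes]
        for asn, prefixes in prefixes_by_as.items()
    }


def is_spoofed(src_ip, ingress_as, valid_space):
    """Return True only if the AS is known and the source is outside its valid space."""
    nets = valid_space.get(ingress_as)
    if nets is None:
        return False  # conservative: no data for this AS, so do not flag
    addr = ip_address(src_ip)
    return not any(addr in net for net in nets)


# Hypothetical valid address space for one member AS (documentation prefixes).
valid_space = build_valid_space({64500: ["192.0.2.0/24", "198.51.100.0/24"]})

legit = is_spoofed("192.0.2.10", 64500, valid_space)
spoofed = is_spoofed("203.0.113.5", 64500, valid_space)
unknown_as = is_spoofed("203.0.113.5", 64501, valid_space)
```

The hard part in practice is building `valid_space` correctly, since multi-AS organisations and hidden relationships make the true set of legitimate prefixes larger than what routing data alone suggests.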
Mattijs Jonker was next, explaining how a third of the Internet is under attack. He’s been characterising the DDoS ecosystem, which has grown hugely of late with booter services doing DDoS for hire for a few dollars. In order to understand targets, attacks and protection services, he uses the UCSD network telescope (a /8 darknet), amplification honeypots (which offer large amplification, to attract attackers), active DNS measurement services and analyses of protection offerings. In two years’ data, to February 2017, they saw almost 21m attacks, or 30,000 a day, targeting a third of the actively used /24 blocks; about half targeted HTTP and a further 20% HTTPS. A total of 210m web sites were attacked over the two years, or about 2/3 of the total. The traffic peaks correspond to attacks on the big hosters. Finally, there are a lot of attacks targeting web infrastructure.
Zhongjie Wang has been looking at evading stateful Internet censorship. Like a standard network intrusion detection system, a censorship system may be vulnerable to hacks that mislead it about TCP and other state. Zhongjie has been playing with the Great Firewall of China and has got a 95% success rate with his techniques. The basic idea is to desynchronise TCP states and program states. He has insertion packets that are accepted by the GFW but dropped by the server, and evasion packets the other way round. Some of the packets were designed after careful study of the behaviour of the Linux and Windows kernels. However the GFW has been getting smarter, and now resynchronises on seeing multiple SYN or multiple SYN/ACK packets, or even a SYN/ACK with an incorrect ACK number. Then it updates its SEQ number. It’s hard for the GFW to be fully immune to such attacks, but its black-box nature and changing behaviour make measuring it hard.
Fangfan Li works on exposing (and avoiding) traffic-classification rules. There has been much work on obfuscation, domain fronting and the like, but such methods can be fragile to changes of classification methods. Fangfan’s project, “liberate”, is about automatically detecting the classification rules, characterising them, and selecting evasion techniques from a suite; these are tried iteratively until some combination works, or the repertoire is exhausted. Detection uses the “record and replay” mechanism he presented at IMC 15. The basic evasion techniques are match and forget, split and reorder, and flushing the classifier state. He found, for example, that he could defeat the Great Firewall by cache flushing, but the required delay varied by time of day from 50 seconds to 250 seconds.